Methods for diagnosis and treatment of psychiatric disorders

ABSTRACT

The present invention provides methods for the diagnosis and treatment of psychiatric disorders. In particular, the present invention provides convergent functional genomics methods for the identification of candidate genes associated with psychiatric disorders such as mania and psychosis, as well as other multi-faceted diseases and syndromes.

[0001] This invention was made, in part, with Government support by the National Institutes of Health Grant Number MH47612. The Government has certain rights in the invention.

FIELD OF THE INVENTION

[0002] The present invention provides methods for the diagnosis and treatment of psychiatric disorders. In particular, the present invention provides convergent functional genomics and other methods for the identification of candidate genes associated with psychiatric disorders such as mania and psychosis, as well as other multi-faceted diseases and syndromes. In addition, the present invention provides methods and compositions for the screening and identification of therapeutic compounds, as well as genetic and protein-based therapies.

BACKGROUND OF THE INVENTION

[0003] In the U.S., major depression ranks first among all causes of disability and second after heart disease as a cause of healthy years lost to premature mortality and disability (See, Hyman and Rudorfer, “Depressive and Bipolar Mood Disorders,” in Dale and Federman (eds.), Scientific American Medicine, Healtheon/WebMD, New York, N.Y. [2000]). Indeed, approximately 10 percent of the population experiences at least one depressive episode that would benefit from treatment, while 5 percent would be classified as having severe and disabling symptoms of depression (See, Hyman and Rudorfer, supra).

[0004] While the prevalence of unipolar depression (major depression) in the U.S. is 5-10 percent, with women having approximately a two-fold greater risk than men, the prevalence of bipolar disorder (manic-depressive illness) is approximately 1 percent, is less variable, and affects men and women equally (See, Hyman and Rudorfer, supra). There is a strong familial association for unipolar, as well as bipolar disorder. For example, the familial nature of bipolar disorder is associated with a 5 to 10-fold increased risk in first-degree relatives above the 1 percent risk in the general population (See, Hyman and Rudorfer, supra). Bipolar disorder often begins in young adulthood (e.g., second or third decade of life), although childhood onset is increasingly being recognized. Late onset is less common, but can even occur in the elderly. In rare cases, patients may have only a single manic episode. However, the vast majority of patients have recurrent episodes of illness, with the rate of cycling between mania and depression varying widely among individuals, and the episodes becoming more frequent with age. Between episodes of depression and mania, the majority of patients are symptom-free, although as many as one-third of patients exhibit residual symptoms.

[0005] Patients affected by bipolar disorder have had at least one manic or hypomanic (mild mania) episode. However, at the time of diagnosis, they may never have had a depressive episode, according to the DSM-IV criteria. The diagnosis is supported by family history data and observational studies. According to the DSM-IV, patients with full manias and depression are indicated as having “bipolar I disorder,” while patients with hypomanias and depressions are described as having “bipolar II disorder.” Onset of episodes tends to be acute, with symptoms developing over days to weeks. The depressive episodes of bipolar patients are indistinguishable from those of patients with unipolar disorder. Thus, misdiagnosis of bipolar disorder is common. Indeed, as many as 40 percent of bipolar patients are initially misdiagnosed (See, Hyman and Rudorfer, supra). It is also not uncommon for clinicians to misclassify bipolar patients as depressed or schizophrenic on the basis of their mental status. However, it is important to make a proper diagnosis, as administration of some drugs can seriously worsen the patient's clinical picture.

[0006] In addition to the problems associated with diagnosis, treatment of bipolar disorder can be problematic. Indeed, it has been estimated that 5 percent of patients experience chronic unremitting symptoms despite treatment (See, Hyman and Rudorfer, supra). Mania requires prompt treatment because it can rapidly worsen, resulting in poor judgment that endangers interpersonal relationships, jobs, and finances. Management is founded upon medication, provision of a low-stimulation environment, and protecting the patient from undertaking potentially harmful activities. Initial management of acute mania is often best accomplished through hospitalization. Thus, the management of bipolar disorder can be expensive, intrusive, and difficult. In addition, despite the now routine use of maintenance treatment for bipolar disorder, up to 90 percent of patients experience at least one relapse within 5 years of their original diagnosis (See, Hyman and Rudorfer, supra). Thus, it is clear that improved methods and compositions for the diagnosis and treatment of psychiatric diseases such as bipolar disorder are needed.

SUMMARY OF THE INVENTION

[0007] The present invention provides methods for the diagnosis and treatment of psychiatric disorders. In particular, the present invention provides convergent functional genomics methods for the identification of candidate genes associated with psychiatric disorders such as mania and psychosis, as well as other multi-faceted diseases and syndromes. In some particularly preferred embodiments, the present invention provides methods and compositions for the diagnosis and prognosis of psychiatric disorders. In alternative particularly preferred embodiments, the present invention provides methods and compositions for screening and identification of compounds with therapeutic value for treatment of psychiatric disorders, including but not limited to bipolar disorder, schizophrenia, schizoaffective disorder, psychosis, depression, stimulant abuse, alcoholism, panic disorder, generalized anxiety disorder, attention deficit disorder, post-traumatic stress disorder, and Parkinson's disease. In addition, the present invention provides methods and compositions for the prediction and assessment of patient responses to therapeutic agents, as well as for monitoring patient condition/response to treatment over time. The present invention further provides genes and proteins associated with psychiatric disorders, as well as methods and compositions for gene therapy of psychiatric disorders. In still additional embodiments, the present invention provides methods and compositions for protein-based therapy.

[0008] In one embodiment, the present invention provides convergent functional genomics methods for the identification of candidate genes associated with psychiatric disorders. In particularly preferred embodiments, the methods involve determining changes in gene expression between treated and untreated tissues by using a quantitative hybridization assay and oligonucleotide gene chips or microarrays. In some preferred embodiments, repressed or induced genes are scored as mapping to a psychiatric disorder linkage region if these genes or their human homologues are located within about 10 cM of a putative psychiatric disorder marker. In one embodiment, treatment consists of amphetamine administration, while in others treatment consists of methamphetamine, cocaine or methylphenidate. This invention is not limited to these treatments, as any other direct or indirect dopamine agonist is suitable for use in the present invention. In a preferred embodiment, the psychiatric disorder of this method is bipolar disorder, which is also known as manic-depressive illness. In related embodiments, the psychiatric disorder is selected from the group consisting of unipolar depression, major depression, schizophrenia, schizoaffective disorder and attention deficit disorder.

[0009] The present invention also provides methods for diagnosing bipolar disorder, identifying individuals at risk for bipolar disorder, and assessing bipolar disorder prognosis by detecting sequence variation in a fragment or fragments of a patient's G protein-coupled receptor kinase 3 (GRK3) gene. In a preferred embodiment, the GRK3 gene fragment comprises the promoter. In some embodiments, the sequence variation comprises a SNP located approximately 1330 bp upstream of the GRK3 start codon, while in other embodiments, the sequence variation comprises a SNP positioned upstream of the GRK3 start codon at various locations including about 1306 bp, about 1197 bp, about 901 bp, about 383 bp, and about 110 bp. The present invention also provides methods for predicting treatment response. In one preferred embodiment, the present invention provides methods for predicting a subject's response to an antidepressant, wherein the response is selected from the group consisting of hypomania, mania and psychosis.

[0010] The present invention also provides methods for screening compounds that alter the expression of psychiatric genes comprising: providing a plurality of cells comprising psychiatric genes, standard medium, medium containing at least one dopamine agonist, and at least one test compound; incubating a first aliquot of cells in standard medium containing at least one test compound; incubating a second aliquot of cells in medium containing at least one dopamine agonist and at least one test compound; quantitating expression of the psychiatric genes in the first and second aliquots of cells; and comparing expression of the psychiatric genes in the first and second aliquots of cells. In a preferred embodiment, the cells of the invention are neurally derived cells, while in other embodiments, lymphoblastoid cell lines or other types of cells find use in the present invention. In some embodiments, quantitation of gene expression is achieved using a technique selected from the group consisting of Northern blots, RT-PCR, Western blots, enzyme-linked immunosorbent assays, fluorescence immunoassays, radioimmunoassays, luciferase assays, fluorescence assays, and flow cytometry. In some preferred embodiments, the psychiatric gene is a psychogene, while in other embodiments it is a psychosis-suppressor gene. In a particularly preferred embodiment, the psychiatric gene is selected from the group consisting of the G protein-coupled receptor kinase 3 (GRK3) gene, the D-box binding protein (DBP) gene, the farnesyl-diphosphate farnesyltransferase (FDFT1) gene, the vertebrate LIN7 homolog 1 (VELI1) gene, the sulfotransferase 1A1 (SULT1A1) gene and the insulin-like growth factor 1 (IGF1) gene.

[0011] In particular, the present invention provides methods for the identification of genes associated with psychiatric disorders, comprising the steps of: providing test antisense cRNA and control antisense cRNA; hybridizing the test antisense cRNA and the control antisense cRNA to a microarray comprising at least two nucleic acids; measuring the hybridization of the test antisense cRNA and the control antisense cRNA to the nucleic acids; comparing the hybridization of the test antisense cRNA with the hybridization of the control antisense cRNA to provide a hybridization score; determining whether the hybridization score indicates the test antisense cRNA represents a gene with altered expression; and determining whether the gene maps to a psychiatric disorder linkage region. In preferred embodiments, the identified gene is a human homologue. In another preferred embodiment, the gene maps to within about 10 cM of a putative marker associated with a psychiatric disorder, while in another embodiment, the putative marker associated with a psychiatric disorder has been identified as such in human genetic studies. In some embodiments, the gene with altered expression is selected from the group consisting of induced genes and repressed genes. Additionally, in some embodiments the microarray comprises at least one gene chip. Moreover, the hybridized test antisense cRNA and the control antisense cRNA are labelled in some embodiments, and the label is selected from the group consisting of fluorescent labels, luminescent labels, enzyme labels, and radioactive labels. In particularly preferred embodiments, the psychiatric disorder is selected from the group consisting of bipolar disorder, manic-depressive illness, unipolar depression, major depression, schizophrenia, schizoaffective disorder, and attention deficit disorder. In some embodiments of the method of the present invention, the test antisense cRNA is obtained from an animal treated with a dopamine agonist and the control antisense cRNA is obtained from an animal not treated with a dopamine agonist. In other embodiments, the dopamine agonist is selected from the group consisting of amphetamine, methamphetamine, cocaine and methylphenidate.

[0012] The present invention also provides methods for diagnosing bipolar disorder comprising detecting sequence variation in at least one fragment of a G protein-coupled receptor kinase 3 (GRK3) gene obtained from a subject. In some embodiments, the detecting comprises nucleotide sequencing. In particularly preferred embodiments, the subject is an individual at risk of developing bipolar disorder. In other preferred embodiments, the fragment of GRK3 gene comprises the promoter. Additionally, the present invention provides methods wherein the sequence variation is selected from the group consisting of: a thymine to cytosine transition at approximately 1330 bp upstream of the translation start site of the GRK3 gene; an adenine to guanine transition at approximately 1306 bp upstream of the translation start site of the GRK3 gene; a thymine to guanine transversion at approximately 1197 bp upstream of the translation start site of the GRK3 gene; an adenine to guanine transition at approximately 901 bp upstream of the translation start site of the GRK3 gene; a guanine to adenine transition at approximately 383 bp upstream of the translation start site of the GRK3 gene; and a guanine deletion at approximately 110 bp upstream of the translation start site of the GRK3 gene. In one preferred embodiment, the sequence variation is predictive of a subject's response to an antidepressant, wherein the response is selected from the group consisting of hypomania, mania and psychosis.

[0013] The present invention also provides methods for screening compounds that alter expression of at least one psychiatric gene, comprising the steps of: providing a plurality of cells comprising psychiatric genes, standard medium, medium containing at least one dopamine agonist, and at least one test compound; incubating a first aliquot of the cells with the standard medium and the at least one test compound; incubating a second aliquot of the cells with the medium containing at least one dopamine agonist and the at least one test compound; quantitating the expression of the psychiatric genes in the first aliquot and quantitating the expression of the psychiatric genes in the second aliquot; and comparing the expression of the psychiatric genes in the first aliquot with the expression of the psychiatric genes in the second aliquot. In preferred embodiments, the psychiatric genes are selected from the group consisting of psychogenes and psychosis-suppressor genes. In some embodiments, the method for quantification is selected from the group consisting of Northern blots, RT-PCR, Western blots, enzyme-linked immunosorbent assays, fluorescence immunoassays, radioimmunoassays, luciferase assays, fluorescence assays, and flow cytometry. In particularly preferred embodiments, the psychiatric genes are selected from the group consisting of the G protein-coupled receptor kinase 3 (GRK3) gene, the D-box binding protein (DBP) gene, the farnesyl-diphosphate farnesyltransferase (FDFT1) gene, the vertebrate LIN7 homolog 1 (VELI1) gene, the sulfotransferase 1A1 (SULT1A1) gene, and the insulin-like growth factor 1 (IGF1) gene.

DESCRIPTION OF THE FIGURES

[0014] FIG. 1 shows the location of the single nucleotide polymorphisms (SNPs) detected upon screening fragments of G protein-coupled receptor kinase 3 (GRK3) genomic DNA from subjects with bipolar disorder.

[0015] FIG. 2 shows the SNPs detected in the 5′ end of the GRK3 gene (SEQ ID NO:1), relative to the start codon.

[0016] FIG. 3 shows a Western blot of cell lysates from bipolar patients and normal controls, probed with a GRK3-specific antibody (sc-9306). Similar amounts of protein were run in each lane as confirmed by Coomassie staining of an identical gel (not shown).

[0017] FIG. 4 shows a Western blot of cell lysates from several brain-derived cell lines, probed with a GRK3-specific antibody. The cell lines used in this study included retinoblastoma (Y-79) cells, neuroblastoma (SK-N-MC) cells, and amygdalar (AR-5) cells.

DESCRIPTION OF THE INVENTION

[0018] The present invention provides methods for the diagnosis and treatment of psychiatric disorders. In particular, the present invention provides convergent functional genomics methods for the identification of candidate genes associated with psychiatric disorders such as mania and psychosis, as well as other multi-faceted diseases and syndromes. In particularly preferred embodiments, the present invention provides methods and compositions for the diagnosis and prognosis of psychiatric disorders. In particularly preferred embodiments, the present invention provides methods and compositions for the screening and identification of compounds with therapeutic value for treatment of psychiatric disorders, including but not limited to bipolar disorder, schizophrenia, schizoaffective disorder, psychosis, depression, stimulant abuse, alcoholism, panic disorder, generalized anxiety disorder, attention deficit disorder, post-traumatic stress disorder, and Parkinson's disease. In addition, the present invention provides methods and compositions for the assessment of patient responses to therapeutic agents, as well as for monitoring patient condition/response to treatment over time. The present invention further provides methods and compositions for gene therapy of psychiatric disorders. In still additional embodiments, the present invention provides methods and compositions for protein-based therapy. For ease in reading, the following Description of the Invention is divided into several sections: I. Convergent Functional Genomics; II. Psychogenes and Psychosis-Suppressor Genes; and III. Cell Culture Methods.

[0019] I. Convergent Functional Genomics

[0020] Stimulant administration in man mimics many of the signs and symptoms of psychiatric disorders. For example, it is intended that the approach of the present invention will find use with various animal models and associations with mapping and identification of susceptibility genes that are involved in numerous psychiatric and other diseases. However, it is not intended that the present invention be limited to the administration of any particular stimulant or indeed, any other compound. Nor is it intended the present invention be limited to any particular animal model or any particular disease.

[0021] Specifically, use was made of the observation that single-dose amphetamine treatment in humans reproduces some of the core symptoms of mania, including increased energy, euphoria, irritability, racing thoughts, rapid speech, hyperactivity, decreased need for sleep, and psychomotor agitation. Chronic treatment frequently results in psychotic symptoms that resemble psychotic mania or the positive symptoms of schizophrenia. These clinical phenomena are consistent with a large body of data that indicate a role for dopamine in mania and psychosis (Wilner, in Psychopharmacology: The Fourth Generation of Progress, Bloom and Kupfer (eds.), Raven Press, New York, [1995], page 921). Attempts to map genes for these disorders by positional cloning have yielded some recent successes, with about 20 genomic regions being implicated by linkage studies, many of which are found in studies of both bipolar disorder and schizophrenia (See, Berrettini, Biol. Psychiatr. 47:245 [2000]; and Kelsoe, Curr. Psychiatr. Rep. 1:135 [1999]).

[0022] One of the major difficulties in fine mapping and identification of susceptibility genes for these and other complex genetic disorders is the length of the linkage peaks, which are typically 20 cM or greater. Microarray technologies provide an approach that is capable of simultaneously examining the expression of thousands of genes. Thus, observing changes in gene expression in an amphetamine treatment animal model of mania and psychosis, as well as mapping the genes within these linkage peaks, has provided good candidates for disease susceptibility genes during the development of the present invention. This approach, referred to herein as “convergent functional genomics,” provides methods to identify any number of candidate genes for psychiatric and other disorders. Indeed, this approach was used during the development of the present invention to identify several positional candidate genes for psychiatric disorders.

[0023] In experiments conducted during the development of the present invention, the rat animal model was used. This model is commonly accepted by those in the art for experiments involving psychiatric disorders. In some experiments, rats were treated with a single dose of methamphetamine (4 mg/kg) and sacrificed 24 hours later. This timepoint was chosen as that most likely to detect changes of relevance to mania and psychosis. It was hypothesized that at 24 hours, most short term gene induction relevant to acute intoxication and behavioral activation would have subsided. Furthermore, 24 hours after a single moderate to high dose, animals already exhibited a sensitized response to a second amphetamine challenge. As mania and psychosis are typically chronic processes in man, more persistent gene changes are more likely to be central to pathophysiological mechanisms. Gene expression was examined in the prefrontal cortex and amygdala, using the Affymetrix U34A GeneChip, which interrogates approximately 7,000 known genes and 1,000 ESTs (expressed sequence tags) using an oligonucleotide microarray (See, Lipshutz et al., Nat. Genet. 21:20 [1999]). These brain regions were chosen based on the extensive literature that highlights their central role in cognition and emotion (See e.g., Heimer and Alheid, Adv. Exp. Med. Biol. 1:295 [1991]).

[0024] A two-fold increase or decrease in expression was chosen as a conventional empirical cut-off. Thus, at least a two-fold change in each of two independent animal experiments was used to select those genes with the most robust and reproducible change in expression. In each experiment, pooled tissues from three methamphetamine-treated and three control rats were used. In these analyses, standard default settings of the Affymetrix GeneChip Expression Algorithm were used. A gene had to be called “Present” and “Changed” in at least one out of two experiments and had to have an Average Difference Change greater than 50, as well as a fold change greater than 2 in two out of two experiments. Genes meeting these criteria are summarized in Table 1, for the prefrontal cortex (PFC), and Table 2, for the amygdala (AMY). The genes that were induced more than two-fold in both experiments were also identified by their GenBank accession numbers, as indicated in Tables 1 and 2. A gene was scored as mapping to a linkage region for either schizophrenia (S) or bipolar disorder (B) if its human homologue mapped to within 10 cM of a marker for which at least suggestive evidence of linkage had been reported.
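By way of illustration only, the selection criteria of paragraph [0024] can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the Affymetrix GeneChip Expression Algorithm itself. The record type ExperimentCall and the function is_reproducibly_changed are hypothetical names introduced here. The sketch assumes that the Average Difference Change threshold, like the fold change threshold, is applied in both experiments, and that both values are compared as magnitudes so that induced and repressed genes are treated alike.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ExperimentCall:
    """One gene's result in one treated-versus-control comparison (hypothetical record)."""
    present: bool           # gene was called "Present"
    changed: bool           # gene was called "Changed"
    avg_diff_change: float  # Average Difference Change
    fold_change: float      # fold change, expressed as a magnitude


def is_reproducibly_changed(calls: Sequence[ExperimentCall],
                            min_avg_diff_change: float = 50.0,
                            min_fold_change: float = 2.0) -> bool:
    """Apply the two-experiment filter described in paragraph [0024]."""
    if len(calls) != 2:
        raise ValueError("expected exactly two independent experiments")
    # "Present" and "Changed" in at least one out of two experiments.
    called = any(c.present and c.changed for c in calls)
    # Assumption: both numeric thresholds must be exceeded in two out of two experiments.
    robust = all(abs(c.avg_diff_change) > min_avg_diff_change
                 and c.fold_change > min_fold_change
                 for c in calls)
    return called and robust


# Hypothetical values for a gene that passes the filter.
candidate = [ExperimentCall(True, True, 120.0, 3.1),
             ExperimentCall(False, False, 80.0, 2.4)]
# Fold changes of 1.1 and 1.0 (as reported for GRK2); the other fields are hypothetical.
unchanged = [ExperimentCall(True, True, 120.0, 1.1),
             ExperimentCall(True, True, 90.0, 1.0)]
print(is_reproducibly_changed(candidate), is_reproducibly_changed(unchanged))
```

Requiring the thresholds in both experiments, rather than in their average, is what makes the filter a test of reproducibility: a single large change in one experiment cannot carry a gene through.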

[0025] The chromosomal locations of the human homologues of these genes were then compared with published linkage reports for bipolar disorder and schizophrenia, as well as data generated during the development of the present invention to cross-validate the results and identify high-probability candidate genes. The human homologues and human chromosomal map locations were determined using the NCBI database. GeneCard (Weizmann Institute), a comprehensive database containing all of the various information available regarding known genes and their functions, was also used for each gene identified in the screen. Genes were considered to be positional candidates (i.e., close to a genomic hotspot) if they mapped to within 10 cM of a marker for which there was at least one report of suggestive evidence of linkage (Lander and Kruglyak, Nat. Genet. 11:241 [1995]). The Marshfield integrated linkage map was used as a reference for genetic location. As shown in Tables 1 and 2, eight of these genes met the criteria used in the analyses. It was also noted that a number of interesting genes were very narrowly positioned below this threshold. An indication of the specificity of the result is that GRK2, a close homologue of GRK3, demonstrated no change in expression in either experiment (fold changes of 1.1 and 1.0 in two experiments).

TABLE 1 Candidate Genes Reproducibly Induced in the Prefrontal Cortex (PFC)

Accession # (Rat/Human)   Gene Symbol  Description                                             Fold Induction  Human Chromosomal Location  Linkage Region
M87855/NM_005160          GRK3         G protein-coupled receptor kinase 3                     14.2            22q11                       B
J03179/U48213             DBP          D-box binding protein                                   7.0             19q13.3                     B
M95591/X69141             FDFT1        Farnesyl-diphosphate farnesyltransferase                2.9             8p23.1-p22                  S
AF090134/AF173081         MALS-1       Vertebrate LIN7 homolog 1                               2.9             12q21.3                     B
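For illustration, the positional-candidate rule of paragraph [0025] (a gene's human homologue lying within 10 cM of a marker with at least suggestive evidence of linkage) can be sketched in Python as follows. This is a hedged sketch only. The LinkageMarker record, the function linkage_region_codes, and all marker names and map positions in the example are hypothetical and are not taken from the Marshfield map or from any linkage report. The 10 cM window is treated as inclusive, which is an assumption.

```python
from typing import Iterable, NamedTuple


class LinkageMarker(NamedTuple):
    """A marker with at least suggestive linkage evidence (hypothetical record)."""
    name: str
    chromosome: str
    position_cm: float  # genetic map position in centimorgans
    disorder: str       # "B" for bipolar disorder, "S" for schizophrenia


def linkage_region_codes(chromosome: str,
                         position_cm: float,
                         markers: Iterable[LinkageMarker],
                         window_cm: float = 10.0) -> str:
    """Return the linkage-region codes for a gene's human homologue.

    A code is reported when a marker for that disorder lies on the same
    chromosome within window_cm of the gene. An empty string means the
    gene is not a positional candidate.
    """
    codes = {m.disorder for m in markers
             if m.chromosome == chromosome
             and abs(m.position_cm - position_cm) <= window_cm}
    return ", ".join(sorted(codes))


# Hypothetical markers and positions, for demonstration only.
markers = [LinkageMarker("marker_1", "22", 10.0, "B"),
           LinkageMarker("marker_2", "22", 18.0, "S"),
           LinkageMarker("marker_3", "19", 12.0, "B")]
print(linkage_region_codes("22", 14.0, markers))  # both markers lie within 10 cM
print(linkage_region_codes("8", 5.0, markers))    # no marker on this chromosome
```

The returned string follows the convention of the Linkage Region column of Tables 1 and 2, in which a gene near markers for both disorders is scored "B, S".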

[0026] TABLE 2 Candidate Genes Reproducibly Induced in the Amygdala (AMY)

Accession # (Rat/Human)   Gene Symbol  Description                                             Fold Induction  Human Chromosomal Location  Linkage Region
AA799479/AF038406         NDUFS8       NADH-coenzyme Q reductase                               20.8            11q13
L19998/L19999             SULT1A1      Sulfotransferase 1A1                                    4.3             16p12.1-p11.2               B
AB017711/Z27113           POLR2F       RNA polymerase II polypeptide F                         3.9             22q13.1                     B, S
X14323/U12255             FCGRT        IgG Fc receptor transporter alpha                       3.2             19q13.3                     B
M81183/X57025             IGF1         Insulin-like growth factor I                            3.0             12q22-q24.1                 B
AA998683/(AJ224874) EST¹  HSPB1        Heat-shock protein 27                                   2.8             7q22.1
S62933/U05012             NTRK3        Neurotrophin receptor 3                                 2.7             15q25
X59249/L77730             ADORA3       Adenosine receptor A3                                   2.7             1p21-p13
U64689/U69140             FEZ2         Fasciculation and elongation protein zeta 2 (Zygin II)  2.3             2p22

[0027] For six of the eight genes identified that met the criteria, it is contemplated that these genes have a role in the pathophysiology associated with disease. These six genes, implicated by a convergence of data from both amphetamine response and clinical linkage studies, represent compelling and novel candidates for disease susceptibility loci. Their map locations and contemplated roles in psychiatric disease are discussed in greater detail below. However, an understanding of the mechanism(s) involved in these genes is not necessary in order to use the present invention, nor is it intended that the present invention be limited to any particular mechanism(s). It is contemplated that these genes will find use in various assay and analytical systems, including but not limited to the convergent functional genomics described herein, as well as cell culture and other testing systems (e.g., for gene and protein-based therapies, drug development, etc.).

[0028] A. G Protein-Coupled Receptor Kinase 3 (GRK3)

[0029] The GRK3 gene maps to human chromosome 22q11, and is also referred to as “beta adrenergic receptor kinase 2” (BARK2). This region has been implicated in bipolar disorder by the present inventors and others (See e.g., Lachman et al., Am. J. Med. Genet. 74:121 [1996]; Kelsoe et al., Am. J. Med. Genet. 81:461 [Abstract] [1998]; Edenberg et al., Am. J. Med. Genet. 74:238 [1997]; and Detera-Wadleigh et al., Proc. Natl. Acad. Sci. USA 96:5604 [1999]). Indeed, 22q yielded the highest lod scores of any chromosomal region in the genome survey utilized during development of the present invention. Consistent with many findings in this field, this linkage peak was broad and spanned nearly 20 cM. One of the highest lod scores in this region was 2.2 at D22S419, which maps to within 40 kb of GRK3. This marker is also quite close to the markers identified in the two other independent positive linkage reports for 22q in bipolar disorder. A marker within the GRK3 gene, D22S315, has also been implicated in a study of eye tracking and evoked potential abnormalities in schizophrenia (See, Myles-Worsley et al., Am. J. Med. Genet. 88:544 [1999]).
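As an aid to reading the lod scores cited herein, a lod score is the base-10 logarithm of the odds in favor of linkage, so that a score can be converted back to an odds ratio by raising 10 to that power. The short Python sketch below performs this conversion. The function name lod_to_odds is hypothetical and is introduced here for illustration only.

```python
def lod_to_odds(lod: float) -> float:
    """Convert a lod score (log10 of the odds for linkage) to an odds ratio."""
    return 10.0 ** lod


# The lod score of 2.2 reported at D22S419 corresponds to odds of roughly 158 to 1.
print(round(lod_to_odds(2.2)))
```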

[0030] The known physiological role of GRK3 in desensitization of receptors and its map location make it one of the more interesting candidates identified during the development of the present invention. In the continuing presence of high agonist concentrations, G protein-coupled receptor (GPCR) signaling is rapidly terminated by a process termed “homologous desensitization.” Homologous desensitization of many agonist-activated GPCRs begins when G protein-coupled receptor kinases (GRKs) phosphorylate serine and threonine residues on the receptor's cytoplasmic tail and/or third intracellular loop (Pitcher et al., Ann. Rev. Biochem. 67:653 [1998]). The consequent binding of β-arrestin to phosphorylated GPCRs decreases their affinity for cognate heterotrimeric G proteins, thereby uncoupling the receptor from the G-βγ subunit by steric hindrance. In addition, dopamine D1 receptors can be phosphorylated and desensitized via a GRK3 mechanism (Tiberi et al., J. Biol. Chem. 271:3771 [1996]). Also, GRK3 expression is particularly high in dopaminergic pathways in the central nervous system (Arriza et al., J. Neurosci. 12:4045 [1992]). While an understanding of the mechanism(s) is not necessary in order to use the present invention, these data are consistent with results observed during the development of the present invention that indicate GRK3 exerts an important regulatory effect on brain dopamine receptors. Because dopamine receptors play an important role in the action of amphetamine on the brain, it is believed that amphetamine-induced up-regulation of GRK3 counter-regulates dopamine receptor signaling initiated by mesocorticolimbic dopamine release. Indeed, this gene undergoes a dramatic up-regulation in rat frontal cortex in response to amphetamine challenge. However, it is not intended that the present invention be limited to any particular mechanism(s).

[0031] These data suggest that an apparent major physiological role for GRK3 in neurons is to act as a brake to limit excessive neural activity by inactivating G protein-coupled receptors. It is contemplated that defects in GRK3 function are associated with the inability to desensitize, resulting in a heightened responsiveness to dopamine signals in the brain. It is contemplated that in at least some cases, such genetic variation influences individual variation in behavioral sensitization to stimulants in humans and other animals. It is further contemplated that the present invention will provide means to predict whether individuals with mania have either low levels of the normal protein or high levels of mutated hypoactive protein. Conversely, it is contemplated that individuals with depression have either high levels of the normal protein or normal levels of mutated hyperactive protein. Indeed, this predictive model is supported by post-mortem studies in people who had depression that led to suicide and who had increased levels of GRK2/3 protein in their PFC (Garcia-Sevilla et al., J. Neurochem. 72:282 [1999]).

[0032] In order to test this hypothesis, levels of GRK3 protein in lymphoblastoid cell lines of individuals with bipolar disorder from families with evidence of linkage to 22q11 were tested (See, Example 5). Consistent with this model, three out of six such subjects demonstrated reduced expression of GRK3. These data suggest that a defect in transcriptional regulation in GRK3 contributes to the susceptibility to bipolar disorder in a subset of individuals. Thus, functional defects in this gene appear to prevent the normal desensitization to dopamine or other neurotransmitters, resulting in predisposition to psychiatric disorder(s).

[0033] During the development of the present invention, it was also determined that the defect in GRK3 appears to be a variation in sequences that regulate transcription of the gene. The gene was screened and no evidence of coding sequence defects was found. However, six sequence variants that may affect promoter function were identified (See, Example 3 and FIGS. 1 and 2). Thus, it is contemplated that the present invention will find use in screening and identifying drugs that augment GRK3 expression and/or function.

[0034] B. D Box Binding Protein (DBP)

[0035] D box binding protein (DBP) is a CLOCK-controlled transcriptional activator (Ripperger et al., Genes Dev. 14:679 [2000]) that shows a robust circadian rhythm. In mouse experiments (Yan et al., J. Neurosci. Res. 59:291 [2000]), its highest level of expression in the brain was found to be in the suprachiasmatic nucleus (SCN), but it is also present in the cerebral cortex and caudate-putamen. In the SCN, DBP mRNA levels showed a peak at early daytime (ZT/CT4) and a trough at early nighttime in both light-dark and constant dark conditions. In the cerebral cortex and caudate-putamen, DBP mRNA was also expressed in a circadian manner, but the phase of DBP mRNA expression in these structures showed a 4-8 hour delay compared to the SCN. These data implicate DBP as an arm of the circadian clock. DBP knockout mice show reduced amplitude of the circadian modulation of sleep time, as well as a reduction in the consolidation of sleep episodes (Franken et al., J. Neurosci. 20:617 [2000]). Some clock genes have been shown to be essential for the development of behavioral sensitization to repeated stimulant exposure (Andretic et al., Science 285:1066 [1999]). Circadian rhythm abnormalities have also been implicated in mood disorders (See e.g., Kripke et al., Biol. Psychiatr. 13:335 [1978]; and Bunney and Bunney, Neuropsychopharmacol. 22:335 [2000]).

[0036] DBP maps to chromosome 19q13.3. Chromosome 19 has not been a strong linkage region for psychiatric disorders, although one study has implicated this region in a large Canadian kindred with bipolar disorder (Morissette et al., Am. J. Med. Genet. 88:567 [1999]). In this sample, D19S867, which is approximately 2 cM from DBP, yielded a lod score of 2.6. Taken together, the connections between clock genes, stimulant sensitization and circadian rhythmicity suggest a potential role for DBP in mood disorders.

[0037] C. Farnesyl-diphosphate Farnesyltransferase 1 (FDFT1)

[0038] FDFT1, also known as “human squalene synthase” (HSS), is involved in the first step of sterol biosynthesis uniquely committed to the synthesis of cholesterol (Schechter et al., Genomics 20:116 [1994]). As such, it has received attention as a target for the development of cholesterol-lowering drugs. Interestingly, primary prevention human trials have shown a correlation between lowering cholesterol and suicide, postulated to occur due to lowering the numbers of serotonin receptors in synapses (Engelberg, Lancet 339:727 [1992]). Studies in monkeys have also shown an association between cholesterol and central serotonergic activity (Kaplan et al., Ann. NY Acad. Sci. 836:57 [1997]). Mice homozygously disrupted for the squalene synthase gene exhibited embryonic lethality and defective neural tube closure, implicating de novo cholesterol synthesis in nervous system development (Tozawa et al., J. Biol. Chem. 274:30843 [1999]). Moreover, de novo cholesterol synthesis was shown to be important for neuronal survival, and apoE4, which is a major risk factor for Alzheimer's disease, has been implicated in inducing neuronal cell death through the suppression of de novo cholesterol synthesis (Michikawa and Yanagisawa, Mech. Ageing Dev. 107:223 [1999]). As such, it is contemplated that neuronal cholesterol synthesis, of which squalene synthase is a key regulator, is positively correlated with both elevated mood and neuronal survival. Nonetheless, an understanding of the mechanism(s) is not necessary in order to use the present invention, nor is it intended that the present invention be limited to any particular mechanism(s).

[0039] FDFT1 is located on 8p23.1-p22, near the telomere. Numerous studies have implicated 8p in both schizophrenia and bipolar disorder. However, most of these results are about 40-50 cM centromeric to FDFT1. Two studies have reported evidence for linkage to schizophrenia within 10 cM of FDFT1. Wetterberg et al. (Am. J. Med. Genet. 81:470 [Abstract] [1998]) reported a lod score of 3.8 at D8S264 in a large Swedish isolate. The NIMH Schizophrenia Genetics Consortium also reported evidence implicating a broad area of 8p in African American pedigrees, including two putative peaks, with one at D8S264 (NPL Z score 2.3) (Kaufmann et al., Am. J. Med. Genet. 81:282 [1998]).

[0040] D. Vertebrate LIN7 Homolog 1 (MALS-1 or VELI1)

[0041] MALS-1 is a PDZ domain-containing cytoplasmic protein that is enriched in brain synapses where it associates in complexes with PSD-95 and NMDA type glutamate receptors (Jo et al., J. Neurosci. 19:4189 [1999]). It has been implicated in regulation of neurotransmitter receptor recruitment to the post-synaptic density, as well as being part of a complex with CASK and Mint 1 that couples synaptic vesicle exocytosis to cell adhesion (Butz et al., Cell 94:773 [1998]).

[0042] MALS-1 maps to 12q21.3, in a region implicated in several studies of bipolar disorder. This region was first reported in bipolar disorder through observation of a Welsh family in which bipolar disorder and Darier's disease co-segregated (Dawson et al., Am. J. Med. Genet. 60:94 [1995]). Though the Darier's region is somewhat distal to MALS-1, Morissette et al. reported evidence of linkage of bipolar disorder to markers on 12q, with a maximum at D12S82 (Z_(all) 4.0, lod score 2.2), which is approximately 2 cM from MALS-1 (Morissette et al., supra).

[0043] E. Sulfotransferase 1 A1 (SULT1A1)

[0044] SULT1A1 is a sulfotransferase that inactivates dopamine and other phenol-containing compounds by sulfation. It is contemplated as playing a role in limiting the neuronal stimulatory and psychosis-promoting effects of dopamine. Though it is not a primary regulator of synaptic dopamine concentration, a defect in this gene could lead to impaired clearing of dopamine from the extracellular space with a resulting amphetamine-like effect. SULT1A1 has not yet been precisely mapped, but cytogenetic data locate it to chromosome 16p12.1-p11.2, near a genomic locus implicated in bipolar disorder (D16S510, lod score 2.5) (Ewald et al., Psychiatr. Genet. 5:71 [1995]), and alcohol dependence (D16S675, lod score 4.0) (Foroud et al., Alcohol Clin. Exp. Res. 22:2035 [1998]).

[0045] F. Insulin-Like Growth Factor 1 (IGF1)

[0046] IGF1 stimulates increased expression of tyrosine hydroxylase, the rate-limiting enzyme in the biosynthesis of dopamine (Hwang and Choi, J. Neurochem. 65:1988 [1995]). It has also been shown to have trophic effects on dopamine brain neurons and to protect dopamine neurons from apoptotic death (Knusel et al., Adv. Exp. Med. Biol. 293:351 [1991]). IGF1 also induces phosphatidylinositol 3-kinase survival pathways through activation of AKT1 and AKT2; it is inhibited by TNF in its neuroprotective role. IGF1 gene disruption in mice results in reduced brain size, CNS hypomyelination, and loss of hippocampal granule and striatal parvalbumin-containing neurons (Beck et al., Neuron 14:717 [1995]). Defects of IGF1 in humans produce growth retardation with deafness and mental retardation. IGF1 is located on chromosome 12q22-q24.1. It is at a map position of 109 cM, 13 cM telomeric to MALS-1, and is in the same 40 cM region described above. This region is implicated in bipolar disorder and extends from D12S82 at 96 cM (NPL Z_(all) 4.0) (Morissette et al., supra) to PLA2 at 136 cM (lod score 2.49) (Dawson et al., supra).

[0047] G. Additional Genes

[0048] Two additional genes met the criteria of reproducibility and mapping to a linkage region, but their functions identified to date make them less likely to be disease gene candidates. RNA polymerase II polypeptide (POLR2F) maps to 22q13.1, approximately 10 cM distal to D22S278, which has been implicated in several studies of both bipolar disorder and schizophrenia, as described above. POLR2F is responsible for mRNA production and may control cell size (Schmidt and Schibler, J. Cell Biol. 128:467 [1995]), and overall body morphological features (Bina et al., Prog. Nucl. Acid Res. Mol. Biol. 64:171 [2000]). It is more active in metabolically active cells (Schmidt and Schibler, supra). FCGRT is a receptor for the Fc component of IgG. It structurally resembles the major histocompatibility class I molecule (Kandil et al., Cytogenet. Cell Genet. 73:97 [1996]). FCGRT maps to 19q13.3, near DBP and a marker implicated in bipolar disorder, as discussed above. It is contemplated that activation of these genes is a secondary effect of amphetamine and their mapping near linkage regions is coincidental.

[0049] Several other genes did not meet the stringent criteria used in the development of the present invention. For example, fibroblast growth factor receptor 1 (FGFR1) had an average fold change of 4.1, though the increase was only 1.8 fold in one of the two experiments. Increased expression of astrocytic basic FGF in response to amphetamine was previously demonstrated (Flores et al., J. Neurosci. 18:9547 [1998]). Furthermore, FGF-2, a ligand for FGFR1, has been shown to regulate expression of tyrosine hydroxylase, a critical enzyme in dopamine biosynthesis (Rabinovsky et al., J. Neurochem. 64:2404 [1995]). FGFR1 maps to chromosome 8p11.2-p11.1, approximately 10 cM centromeric to a genomic locus near D8S1771 (8p22-24), which demonstrated evidence of linkage to schizophrenia in several studies (See e.g., Blouin et al., Nat. Genet. 20:70 [1998]; Kendler et al., Am. J. Psychiatr. 153:1534 [1996]; and Levinson et al., Am. J. Psychiatr. 155:741 [1998]). Heat shock 27 kD protein 1 (HSP27, HSPB1) has been implicated in stress resistance responses in a variety of tissues. It is hypothesized that it plays a role in promoting neuronal survival (See e.g., Lewis et al., J. Neurosci. 19:8945 [1999]), and may be induced in the brain by kainic acid-induced seizure (Kato et al., J. Neurochem. 73:229 [1999]). HSPB1 maps to 7q22.1, approximately 20 cM from a region implicated in bipolar disorder in two independent samples (Detera-Wadleigh et al., Am. J. Med. Genet. 74:254 [1997]; and Detera-Wadleigh et al., Proc. Natl. Acad. Sci. USA 96:5604 [1999]).

[0050] In view of the number of genomic regions that have been implicated in bipolar disorder and schizophrenia, it was considered to be important to evaluate the probability that some of the genes identified during the development of the present invention mapped to a disease locus by chance. As indicated above, it was required that a gene map to within 10 cM of a marker identified in at least one study as having suggestive evidence of linkage. Assuming that the average genomic region meeting the criteria used in the present invention is 30 cM long, and approximately 20 such regions have been reported, then about 20 percent of the genome is implicated in bipolar disorder or schizophrenia. Therefore, there is about a 20 percent probability that a gene will fall within a putative linkage region by chance. However, the animal model gene expression methods of the present invention identified about 1 in 1000 genes as being changed. Assuming that there are 75,000 genes in the genome, then each 30 cM linkage region would contain, on average, 750 genes, and the approach of the present invention would identify approximately 1 gene. Thus, combining the 20 percent probability with the 1 in 1000 rate, there is an estimated probability of 1 in 5,000 that a gene would meet both criteria by chance. Clearly, not all genes identified by this approach are genes for these disorders. Nonetheless, the present invention provides methods that are useful in the diagnosis and treatment of psychiatric disorders.
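The arithmetic in the preceding paragraph can be reproduced with a short calculation. The following is a minimal sketch only: the total genetic map length of 3000 cM is an assumption (it is the value implied by 20 regions of 30 cM covering about 20 percent of the genome), the two criteria are assumed to be independent, and all names are illustrative.

```python
# Figures quoted in the paragraph above.
REGION_LENGTH_CM = 30            # average length of an implicated region
NUM_REGIONS = 20                 # approximate number of reported regions
TOTAL_GENES = 75_000             # gene count assumed in the text
FRACTION_GENES_CHANGED = 1 / 1000  # genes changed in the animal model

# ASSUMPTION: total map length implied by the "about 20 percent" figure.
GENOME_LENGTH_CM = 3000


def chance_estimate():
    """Return the intermediate and final values of the by-chance estimate."""
    # Fraction of the genome covered by implicated linkage regions.
    fraction_implicated = REGION_LENGTH_CM * NUM_REGIONS / GENOME_LENGTH_CM
    # Average number of genes in one 30 cM region.
    genes_per_region = TOTAL_GENES * REGION_LENGTH_CM / GENOME_LENGTH_CM
    # Expected number of expression-changed genes per region.
    changed_per_region = genes_per_region * FRACTION_GENES_CHANGED
    # Probability of meeting both criteria by chance, assuming independence.
    joint_probability = fraction_implicated * FRACTION_GENES_CHANGED
    return {
        "fraction_implicated": fraction_implicated,
        "genes_per_region": genes_per_region,
        "changed_per_region": changed_per_region,
        "joint_probability": joint_probability,
    }


if __name__ == "__main__":
    for name, value in chance_estimate().items():
        print(f"{name}: {value:g}")
```

Under these assumptions the expected number of changed genes per region is 0.75, which the text rounds to approximately 1 gene, and the joint probability is 0.2 x 0.001, or 1 in 5,000.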

[0051] Using methods presently known in the art, definitive identification of disease genes typically requires the discovery of a polymorphism of functional significance and its association with illness. In addition, large-scale sequencing of both coding and non-coding regions in numerous affected individuals is needed. Assuming that there is an average of 750 genes per linkage region, this represents an enormous task. In contrast, the convergent functional genomics approach of the present invention provides a relevant animal model, methods and compositions to identify a small number of candidates for exhaustive mutation screening. Thus, the present invention provides methods that effectively reduce the scale of such a project by several hundred-fold.

[0052] It is further contemplated that the high-probability candidate genes for mania and psychosis identified using the convergence of animal model data and human genetic linkage data will be studied in detail for genomic variation in clinical populations and behavioral variation in knockout animal models. In addition, it is contemplated that the convergent functional genomics methods of the present invention will find use with various other polygenic diseases. Indeed, it is not intended that the present invention be limited to psychiatric diseases nor any other particular disease syndrome.

[0053] II. Psychogenes and Psychosis-Suppressor Genes

[0054] The present invention provides evidence that genes involved in psychiatric disorders can be placed into two prototypical categories. Genes whose activity promotes processes that lead to mania or psychosis are referred to herein as “psychogenes” (i.e., analogous to oncogenes). Conversely, genes whose activity suppresses processes that lead to these psychiatric disorders are referred to herein as “psychosis-suppressor” genes (i.e., analogous to tumor suppressor genes). Thus, based on the results observed during the development of the present invention, DBP, FGFR1, NTRK3, FDFT1, MALS-1, and IGF1 are psychogenes, while GRK3, SULT1A1, and ADORA3 are psychosis-suppressor genes. However, it is not intended that the present invention be limited to these particular genes. Indeed, it is contemplated that additional genes and variants will be identified using the methods and compositions of the present invention. Although this classification is simplistic, it has heuristic value for psychiatric illness. It is contemplated that this classification will find use in considerations regarding the role(s) of these putative disease genes in pathophysiology and as targets for therapeutic intervention.

[0055] In particularly preferred embodiments, the present invention finds use in the identification and characterization of dysfunctions in these genes. In some embodiments, the DNA of patients with psychiatric disorders (e.g., bipolar disorder, schizophrenia, etc.) is screened in order to detect DNA sequence variants that are associated with or lead to dysfunction of these genes. In other embodiments, DNA of patients suspected of suffering from psychiatric disorders, as well as DNA of normal subjects who wish to be screened for psychiatric disorders, is tested to screen for the presence of these genetic variants and predict risk for psychiatric illness later in life. In some embodiments, these methods find use in clarifying and/or confirming diagnosis of psychiatric disorders.

[0056] In addition to the diagnostic value of these methods, the present invention also provides means to determine the prognosis of affected patients, as well as predict treatment outcomes. For example, in some embodiments, patients with psychiatric illness who have been treated with medication are tested for these genetic variants, in order to determine the treatment efficacy, as well as to gather evidence as to the medications that are useful in treatment of patients suffering from particular psychiatric disorders. In other embodiments, cells from these patients may be used to assess the treatment efficacy of one or more drugs. In some embodiments, screening tests based on binding and/or functional blockade of dopamine receptors, the dopamine transporter, other neurotransmitter receptors, and/or transporters are used to identify useful compounds in cell culture and/or animal models (e.g., the rat model described in Example 1). In some embodiments, test compounds are compared with compounds known to block dopamine receptors, the dopamine transporter, and/or other neurotransmitter receptors or transporters. In still further embodiments, it is contemplated that simple tests will find use in monitoring the ongoing response of patients to treatment. For example, it is contemplated that a blood test for GRK3 expression will find use in monitoring the efficacy of patient therapy. However, it is not intended that the present invention be limited to GRK3 expression, as any suitable protein finds use in the present invention.

[0057] It is contemplated that the present invention will also find use in screening and identifying drugs that interact with a dysfunctional protein to enhance its function. In some embodiments, the present invention provides methods and compositions to screen drugs that interact with the gene itself or proteins that bind regulatory sequences in the gene and thereby enhance transcription. Thus, it is contemplated that any of several upstream targets will be identified using the present invention, based on their role(s) in regulating the expression of gene(s) of interest. It is further contemplated that such upstream targets will provide superior points of drug intervention and find use in drug design. Indeed, the in vivo and in vitro methods of the present invention provide the means to monitor the expression and/or function of upstream targets based on their ability to indirectly modify the function of the dysfunctional gene or protein. Similarly, proteins downstream of the dysfunctional protein that are involved in the functional pathway of the dysfunctional gene/protein also find use, as well as proteins that interact with and/or facilitate the overall function of the dysfunctional gene. Thus, the present invention provides methods and compositions for the upstream and downstream assessment of test compounds in functional assay systems.

[0058] In addition to the diagnostic, prognostic and drug assessment advances provided by the present invention, the present invention also provides methods and compositions suitable for use in gene therapy regimens. In gene therapy embodiments, treatments that find use increase the expression and/or function of psychosis-suppressor genes and/or decrease the expression and/or function of psychogenes. For example, it is contemplated that GRK3 represents an ideal target for gene therapy methods. The genetic defect appears to be a hypomorph that manifests phenotypically as a recessive trait. Thus, it is contemplated that gene therapy methods that increase or normalize expression of GRK3 in relevant brain regions will find use in treatment of psychiatric disorders (e.g., bipolar disorder, schizophrenia, etc.). However, it is not intended that the present invention be limited to GRK3 protein, as any suitable protein finds use in these methods.

[0059] It is also contemplated that the present invention will find use in protein-based therapies. In these regimens, the protein is delivered directly to the cells deficient in the function of a particular gene/protein. Thus, it is contemplated that any suitable method for the delivery of proteins for therapeutic purposes will find use in the present invention, including but not limited to such methods as the use of fusion proteins (e.g., fusion proteins that include a “passport” domain which facilitates transport of proteins across cell membranes). For example, it is contemplated that normal GRK3 protein will be delivered to neurons using one or more of these methods. However, it is not intended that the present invention be limited to GRK3 protein, as any suitable protein finds use in these methods.

[0060] III. Cell Culture Methods

[0061] In addition to the convergent functional genomics methods described above in which in vivo experimental results are used in conjunction with genome mapping data, the present invention also provides cell culture methods to detect and characterize psychogenes and psychosis-suppressor genes (described in greater detail below). In addition, these methods find use in the screening and detection of compounds that change the function of these genes. For example, it is contemplated that these cell culture methods will find use in detection of compounds that increase the action of psychosis-suppressor genes in either or both the basal and agonist-challenged states.

[0062] In some embodiments, lymphoblastoid cell lines (e.g., similar to those described in Example 3) are exposed to various compounds. In particularly preferred embodiments, cells from normal control subjects, and cells from subjects with at least one psychiatric disorder (e.g., bipolar disorder) are tested and compared. In some embodiments, the cells are tested under conditions in which the cells are exposed to the test compound alone, as well as under conditions in which the cells are also challenged with a dopamine agonist. In particularly preferred embodiments, cells from subjects with bipolar disorder who are shown to have defects in the genes described above are used. In these analyses, testing parameters include mRNA expression of the gene of interest, protein expression, and/or functional measures specific for each gene of interest. In yet other particularly preferred embodiments, compounds of interest increase the expression and function of psychosis-suppressor genes and/or decrease the expression and function of psychogenes in the basal state and preferably in the presence of the dopamine agonist.

[0063] Definitions

[0064] As used herein, the term “mood” refers to an individual's enduring emotional state, while “affect” refers to short-term fluctuations in emotional state. Thus, the term “mood disorder” is used in reference to conditions in which abnormalities of emotional state are the core symptoms. The most common serious mood disorders reportedly seen in general medical practice are major depression (unipolar depression), dysthymic disorder (chronic, milder form of depression), and bipolar disorder (manic-depressive illness).

[0065] As used herein, the term “psychiatric disorder” refers to mental, emotional, or behavioral abnormalities. These include but are not limited to bipolar disorder, schizophrenia, schizoaffective disorder, psychosis, depression, stimulant abuse, alcoholism, panic disorder, generalized anxiety disorder, attention deficit disorder, post-traumatic stress disorder, and Parkinson's disease.

[0066] The term “bipolar disorder,” as used herein, refers to any of several mood disorders characterized usually by alternating episodes of depression and mania (e.g., bipolar disorder I) or by episodes of depression alternating with mild nonpsychotic excitement or hypomania (e.g., bipolar disorder II). Individuals at risk of developing bipolar disorder include those with a family history of bipolar disorder. Those at greatest risk have first-degree relatives who are diagnosed with bipolar disorder I or II.

[0067] The terms “gene associated with a psychiatric disorder” and “psychiatric gene,” as used herein, refer to genes whose activity plays a role in the processes leading to development of psychiatric disorders. This role may be one of promotion or suppression and thus encompasses both psychogenes and psychosis-suppressor genes.

[0068] The terms “marker associated with” and “genetic linkage” refer to the greater association in inheritance of two or more nonallelic genes than would be expected from independent assortment (e.g., genes are linked because they reside near each other on the same chromosome).

[0069] As used herein, the term “psychogenes” refers to genes whose activity promotes processes that lead to mania or psychosis (i.e., analogous to oncogenes). Conversely, genes whose activity suppresses processes that lead to mania or psychosis are referred to herein as “psychosis-suppressor genes” (i.e., analogous to tumor suppressor genes).

[0070] The terms “microarray,” “GeneChip,” “genome chip,” and “biochip,” as used herein, refer to an ordered arrangement of hybridizable array elements. The array elements are arranged so that there are preferably one or more different array elements on a substrate surface. The hybridization signal from each of the array elements is individually distinguishable. In a preferred embodiment, the array elements comprise oligonucleotides, although the present invention could also be used with cDNA or other types of nucleic acid array elements.

[0071] As used herein, the term “altered expression” refers to differences in gene expression observed upon comparing cells incubated under test and control conditions. This term encompasses both induced (e.g., increased expression) genes and repressed (e.g., decreased expression) genes. In preferred embodiments, the fold change in expression between test and control conditions is greater than two in at least two experiments.
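By way of illustration only, the preferred criterion above (a fold change greater than two in at least two experiments) can be sketched as follows. The function names, the signed fold-change convention, and the requirement that the qualifying experiments agree in direction are assumptions made for this sketch and are not taken from the specification.

```python
def fold_change(test, control):
    """Signed fold change between test and control expression values.

    Positive values denote induction (test > control); negative values
    denote repression (test < control). This sign convention is an
    assumption of the sketch.
    """
    if test <= 0 or control <= 0:
        raise ValueError("expression values must be positive")
    ratio = test / control
    return ratio if ratio >= 1 else -1 / ratio


def is_altered(fold_changes, threshold=2.0, min_experiments=2):
    """Return True if enough experiments exceed the fold-change threshold.

    ASSUMPTION: the qualifying experiments must agree in direction
    (all induced or all repressed).
    """
    induced = sum(1 for fc in fold_changes if fc > threshold)
    repressed = sum(1 for fc in fold_changes if fc < -threshold)
    return max(induced, repressed) >= min_experiments


if __name__ == "__main__":
    # Hypothetical values: a high mean fold change does not qualify
    # when only one of two experiments exceeds the threshold.
    print(is_altered([6.4, 1.8]))   # False
    print(is_altered([2.5, 3.0]))   # True
```

A gene whose fold changes average well above two can therefore still fail the criterion if the change is not reproduced, which is the reason for requiring at least two qualifying experiments rather than a mean.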

[0072] The term “hybridization score,” as used herein, refers to the degree of binding observed between a probe and a nucleic acid array element of the microarray or GeneChip. In some embodiments, this score is determined by measuring the fluorescence intensity of a labelled probe, although this invention is not limited to the use of fluorescent quantification techniques.

[0073] As used herein, the term “labelled” refers to the attachment of a traceable constituent to a biological molecule in order to more easily quantify or trace the biological molecule of interest. In some embodiments, the label may be a fluorescent, luminescent, enzymatic or radioactive label. For instance, probe hybridization to a nucleic acid array element may be measured by directly or indirectly (e.g., via a biotin/avidin or a biotin/streptavidin linkage) attaching a phycoerythrin or fluorescein tag to the probe.

[0074] As used herein, the term “sequence variation” refers to differences observed in nucleic acid sequence between individuals. “Sequence variation” includes both “single nucleotide polymorphisms” and larger stretches of differences.

[0075] The term “single nucleotide polymorphism” (SNP) refers to single differences observed in a given position of a nucleic acid sequence between individuals. These polymorphisms may be the result of point mutations and include substitutions such as transitions and transversions. “Transitions” are a change of a pyrimidine nucleotide, C or T, into another pyrimidine nucleotide, or a change of a purine nucleotide, A or G, into another purine nucleotide. “Transversions” are a change of a pyrimidine nucleotide, C or T, into a purine nucleotide, A or G, or vice versa. Transitions are more common than transversions. As used herein, the term SNP also includes single nucleotide deletions and insertions.
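The distinction between transitions and transversions drawn above can be expressed as a simple rule on base classes. The following minimal sketch is illustrative only; the function name and the handling of identical bases are assumptions, and single nucleotide insertions and deletions are outside its scope.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base DNA substitution.

    Returns "transition" when both bases are purines or both are
    pyrimidines, "transversion" when the base class changes, and
    "none" when the two bases are identical.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        return "none"
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    print(classify_substitution("C", "T"))  # transition
    print(classify_substitution("A", "C"))  # transversion
```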

[0076] As used herein, the term “human homologue” refers to a human gene which shares a common ancestor with a gene from another species. Homologous genes can be identified as such by determining the percent identity of two nucleic acid sequences or can be inferred by comparing the predicted structure of the proteins encoded by these genes.
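As one illustration of the percent identity measure mentioned above, the following sketch compares two sequences that have already been aligned to equal length. It is a simplified example under stated assumptions: the alignment step itself is not shown, gaps are represented by "-", columns in which both sequences carry a gap are ignored, and the function name is hypothetical. Other conventions for the denominator are in common use.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity of two pre-aligned, equal-length sequences.

    ASSUMPTIONS: the sequences are already aligned; columns where both
    sequences have a gap are skipped; a gap opposite a base counts as
    a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    print(percent_identity("ACGT", "ACGA"))  # 75.0
```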

[0077] A “variant” of a protein of interest, as used herein, refers to an amino acid sequence that is altered by one or more amino acids. The variant may have “conservative” changes, wherein a substituted amino acid has similar structural or chemical properties (e.g., replacement of leucine with isoleucine). More rarely, a variant may have “nonconservative” changes (e.g., replacement of a glycine with a tryptophan). Similar minor variations may also include amino acid deletions or insertions, or both. Guidance in determining which amino acid residues may be substituted, inserted, or deleted without abolishing biological or immunological activity may be found using computer programs well known in the art, for example, DNASTAR software.

[0078] As used herein, the terms “translation start site” and “start codon” refer to the ATG or AUG encoding the first amino acid moiety (e.g., methionine) of a nascent polypeptide chain. This may not be the first ATG or AUG codon found in the message and the methionine encoded by this triplet may not be present in the processed, mature form of the polypeptide or protein.

[0079] The term “biologically active,” as used herein, refers to a protein or other biologically active molecule (e.g., catalytic RNA) having structural, regulatory, or biochemical functions of a naturally occurring molecule. Likewise, “immunologically active” refers to the capability of the natural, recombinant, or synthetic protein or any oligopeptide or polynucleotide thereof, to induce a specific immune response in appropriate animals or cells and to bind with specific antibodies.

[0080] As used herein, the term “dopamine agonist” refers to any compound which has activities similar to that of dopamine by virtue of binding to dopamine receptors. The dopamine agonists of the present invention include but are not limited to amphetamine, methamphetamine, cocaine and methylphenidate.

[0081] The term “agonist,” as used herein, refers to a molecule which, when bound to a compound of interest, causes a change in the compound, which modulates the activity of the compound. Agonists may include proteins, nucleic acids, carbohydrates, or any other molecules which bind or interact with the compound.

[0082] The terms “antagonist” and “inhibitor,” as used herein, refer to a molecule which, when bound to a compound of interest, blocks or modulates the biological or immunological activity of the compound of interest. Antagonists and inhibitors may include proteins, nucleic acids, carbohydrates, or any other molecules which bind or interact with the compound of interest.

[0083] The term “modulate,” as used herein, refers to a change or an alteration in the biological activity of a compound of interest. Modulation may be an increase or a decrease in protein activity, a change in binding characteristics, or any other change in the biological, functional, or immunological properties of the compound of interest.

[0084] The term “gene” refers to a nucleic acid (e.g., DNA) sequence that comprises coding sequences necessary for the production of a polypeptide or precursor. The polypeptide can be encoded by a full-length coding sequence or by any portion of the coding sequence so long as the desired activity or functional properties (e.g., enzymatic activity, ligand binding, signal transduction, etc.) of the full-length or fragment are retained. The term also encompasses the coding region of a structural gene and includes sequences located adjacent to the coding region on both the 5′ and 3′ ends for a distance of about 1 kb on either end such that the gene corresponds to the length of the full-length mRNA. The sequences which are located 5′ of the coding region and which are present on the mRNA are referred to as 5′ non-translated sequences. The sequences which are located 3′ or downstream of the coding region and which are present on the mRNA are referred to as 3′ non-translated sequences. The term “gene” encompasses both cDNA and genomic forms of a gene.

[0085] A genomic form or clone of a gene contains the coding region interrupted with non-coding sequences termed “introns” or “intervening regions” or “intervening sequences.” Introns are segments of a gene which are transcribed into heterogeneous nuclear RNA (hnRNA); introns may contain regulatory elements such as enhancers. Introns are removed or “spliced out” from the nuclear or primary transcript; introns therefore are absent in the messenger RNA (mRNA) transcript. The mRNA functions during translation to specify the sequence or order of amino acids in a nascent polypeptide.

[0086] Where “amino acid sequence” is recited herein to refer to an amino acid sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms, such as “polypeptide” or “protein,” are not meant to limit the amino acid sequence to the complete, native amino acid sequence associated with the recited protein molecule.

[0087] In addition to containing introns, genomic forms of a gene may also include sequences located on both the 5′ and 3′ end of the sequences which are present on the RNA transcript. These sequences are referred to as “flanking” sequences or regions (these flanking sequences are located 5′ or 3′ to the non-translated sequences present on the mRNA transcript). The 5′ flanking region may contain regulatory sequences such as promoters and enhancers which control or influence the transcription of the gene. The 3′ flanking region may contain sequences which direct the termination of transcription, post-transcriptional cleavage and polyadenylation.

[0088] The term “wild-type” refers to a gene or gene product which has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the “normal” or “wild-type” form of the gene. In contrast, the term “modified” or “mutant” refers to a gene or gene product which displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally-occurring mutants can be isolated; these are identified by the fact that they have altered characteristics when compared to the wild-type gene or gene product.

[0089] As used herein, the terms “nucleic acid molecule encoding,” “DNA sequence encoding,” and “DNA encoding” refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides determines the order of amino acids along the polypeptide (protein) chain. The DNA sequence thus codes for the amino acid sequence.

[0090] DNA molecules are said to have “5′ ends” and “3′ ends” because mononucleotides are reacted to make oligonucleotides or polynucleotides in a manner such that the 5′ phosphate of one mononucleotide pentose ring is attached to the 3′ oxygen of its neighbor in one direction via a phosphodiester linkage. Therefore, an end of an oligonucleotide or polynucleotide is referred to as the “5′ end” if its 5′ phosphate is not linked to the 3′ oxygen of a mononucleotide pentose ring and as the “3′ end” if its 3′ oxygen is not linked to a 5′ phosphate of a subsequent mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a larger oligonucleotide or polynucleotide, also may be said to have 5′ and 3′ ends. In either a linear or circular DNA molecule, discrete elements are referred to as being “upstream” or 5′ of the “downstream” or 3′ elements. This terminology reflects the fact that transcription proceeds in a 5′ to 3′ fashion along the DNA strand. The promoter and enhancer elements which direct transcription of a linked gene are generally located 5′ or upstream of the coding region. However, enhancer elements can exert their effect even when located 3′ of the promoter element and the coding region. Transcription termination and polyadenylation signals are located 3′ or downstream of the coding region.

[0091] As used herein, the terms “an oligonucleotide having a nucleotide sequence encoding a gene” and “polynucleotide having a nucleotide sequence encoding a gene” mean a nucleic acid sequence comprising the coding region of a gene or, in other words, the nucleic acid sequence which encodes a gene product. The coding region may be present in either a cDNA, genomic DNA or RNA form. When present in a DNA form, the oligonucleotide or polynucleotide may be single-stranded (i.e., the sense strand) or double-stranded. Suitable control elements such as enhancers/promoters, splice junctions, polyadenylation signals, etc. may be placed in close proximity to the coding region of the gene if needed to permit proper initiation of transcription and/or correct processing of the primary RNA transcript. Alternatively, the coding region utilized in the expression vectors of the present invention may contain endogenous enhancers/promoters, splice junctions, intervening sequences, polyadenylation signals, etc. or a combination of both endogenous and exogenous control elements.

[0092] As used herein, the term “regulatory element” refers to a genetic element which controls some aspect of the expression of nucleic acid sequences. For example, a promoter is a regulatory element which facilitates the initiation of transcription of an operably linked coding region. Other regulatory elements are splicing signals, polyadenylation signals, termination signals, etc. (defined infra).

[0093] Transcriptional control signals in eukaryotes comprise “promoter” and “enhancer” elements. Promoters and enhancers consist of short arrays of DNA sequences that interact specifically with cellular proteins involved in transcription (Maniatis et al., Science 236:1237 [1987]). Promoter and enhancer elements have been isolated from a variety of eukaryotic sources including genes in yeast, insect and mammalian cells and viruses (analogous control elements, i.e., promoters, are also found in prokaryotes). The selection of a particular promoter and enhancer depends on what cell type is to be used to express the protein of interest. Some eukaryotic promoters and enhancers have a broad host range while others are functional in a limited subset of cell types (Voss et al., Trends Biochem. Sci. 11:287 [1986]; and Maniatis et al., supra). For example, the SV40 early gene enhancer is very active in a wide variety of cell types from many mammalian species and has been widely used for the expression of proteins in mammalian cells (Dijkema et al., EMBO J. 4:761 [1985]). Two other examples of promoter/enhancer elements active in a broad range of mammalian cell types are those from the human elongation factor 1α gene (Uetsuki et al., J. Biol. Chem. 264:5791 [1989]; Kim et al., Gene 91:217 [1990]; and Mizushima and Nagata, Nuc. Acids. Res. 18:5322 [1990]) and the long terminal repeats of the Rous sarcoma virus (Gorman et al., Proc. Natl. Acad. Sci. USA 79:6777 [1982]) and the human cytomegalovirus (Boshart et al., Cell 41:521 [1985]).

[0094] As used herein, the term “promoter/enhancer” denotes a segment of DNA which contains sequences capable of providing both promoter and enhancer functions (i.e., the functions provided by a promoter element and an enhancer element; see above for a discussion of these functions). For example, the long terminal repeats of retroviruses contain both promoter and enhancer functions. The enhancer/promoter may be “endogenous” or “exogenous” or “heterologous.” An “endogenous” enhancer/promoter is one which is naturally linked with a given gene in the genome. An “exogenous” or “heterologous” enhancer/promoter is one which is placed in juxtaposition to a gene by means of genetic manipulation (i.e., molecular biological techniques) such that transcription of that gene is directed by the linked enhancer/promoter.

[0095] The presence of “splicing signals” on an expression vector often results in higher levels of expression of the recombinant transcript. Splicing signals mediate the removal of introns from the primary RNA transcript and consist of a splice donor and acceptor site (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York [1989], pp. 16.7-16.8). A commonly used splice donor and acceptor site is the splice junction from the 16S RNA of SV40.

[0096] Efficient expression of recombinant DNA sequences in eukaryotic cells requires the presence of signals directing the efficient termination and polyadenylation of the resulting transcript. Transcription termination signals are generally found downstream of the polyadenylation signal and are a few hundred nucleotides in length. The term “poly A site” or “poly A sequence” as used herein denotes a DNA sequence which directs both the termination and polyadenylation of the nascent RNA transcript. Efficient polyadenylation of the recombinant transcript is desirable as transcripts lacking a poly A tail are unstable and are rapidly degraded. The poly A signal utilized in an expression vector may be “heterologous” or “endogenous.” An endogenous poly A signal is one that is found naturally at the 3′ end of the coding region of a given gene in the genome. A heterologous poly A signal is one which is isolated from one gene and placed 3′ of another gene. A commonly used heterologous poly A signal is the SV40 poly A signal. The SV40 poly A signal is contained on a 237 bp BamHI/BclI restriction fragment and directs both termination and polyadenylation (Sambrook, supra, at 16.6-16.7).

[0097] As used herein, the terms “complementary” or “complementarity” are used in reference to polynucleotides (i.e., a sequence of nucleotides) related by the base-pairing rules. For example, the sequence “A-G-T” is complementary to the sequence “T-C-A.” Complementarity may be “partial,” in which only some of the nucleic acids' bases are matched according to the base pairing rules. Or, there may be “complete” or “total” complementarity between the nucleic acids. The degree of complementarity between nucleic acid strands has significant effects on the efficiency and strength of hybridization between nucleic acid strands. This is of particular importance in amplification reactions, as well as detection methods which depend upon binding between nucleic acids.
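
As a minimal sketch of the base-pairing rules described above, the following Python fragment derives the complement of a DNA sequence and scores partial complementarity between two sequences of equal length. The helper names `complement` and `fraction_paired` are hypothetical, and the position-by-position comparison follows the “A-G-T” / “T-C-A” convention of the example above rather than any particular hybridization model.

```python
# Watson-Crick pairing rules for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(sequence):
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(PAIRS[base] for base in sequence)


def fraction_paired(first, second):
    """Fraction of aligned positions that obey the pairing rules.

    A value of 1.0 corresponds to complete complementarity; values
    between 0 and 1 correspond to partial complementarity.
    """
    if len(first) != len(second) or not first:
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(1 for a, b in zip(first, second) if PAIRS[a] == b)
    return paired / len(first)


complete = fraction_paired("AGT", "TCA")
partial = fraction_paired("AGT", "TCG")
```

Here `complete` describes two fully complementary sequences, while `partial` describes a pair in which only two of the three positions are matched.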

[0098] The term “homology” refers to a degree of complementarity. There may be partial homology or complete homology (i.e., identity). A partially complementary sequence is one that at least partially inhibits a completely complementary sequence from hybridizing to a target nucleic acid and is referred to using the functional term “substantially homologous.” The inhibition of hybridization of the completely complementary sequence to the target sequence may be examined using a hybridization assay (Southern or Northern blot, solution hybridization and the like) under conditions of low stringency. A substantially homologous sequence or probe will compete for and inhibit the binding (i.e., the hybridization) of a completely homologous sequence or probe to a target under conditions of low stringency. This is not to say that conditions of low stringency are such that non-specific binding is permitted; low stringency conditions require that the binding of two sequences to one another be a specific (i.e., selective) interaction. The absence of non-specific binding may be tested by the use of a second target which lacks even a partial degree of complementarity (e.g., less than about 30 percent identity); in the absence of non-specific binding the probe will not hybridize to the second non-complementary target.

[0099] The art knows well that numerous equivalent conditions may be employed to comprise low stringency conditions; factors such as the length and nature (DNA, RNA, base composition) of the probe and nature of the target (DNA, RNA, base composition, present in solution or immobilized, etc.) and the concentration of the salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol) are considered and the hybridization solution may be varied to generate conditions of low stringency hybridization different from, but equivalent to, the above listed conditions. In addition, the art knows conditions which promote hybridization under conditions of high stringency (e.g., increasing the temperature of the hybridization and/or wash steps, the use of formamide in the hybridization solution, etc.).

[0100] When used in reference to a double-stranded nucleic acid sequence such as a cDNA or genomic clone, the term “substantially homologous” refers to any probe which can hybridize to either or both strands of the double-stranded nucleic acid sequence under conditions of low stringency as described above.

[0101] A gene may produce multiple RNA species which are generated by differential splicing of the primary RNA transcript. cDNAs that are splice variants of the same gene will contain regions of sequence identity or complete homology (representing the presence of the same exon or portion of the same exon on both cDNAs) and regions of complete non-identity (for example, representing the presence of exon “A” on cDNA 1 wherein cDNA 2 contains exon “B” instead). Because the two cDNAs contain regions of sequence identity they will both hybridize to a probe derived from the entire gene or portions of the gene containing sequences found on both cDNAs; the two splice variants are therefore substantially homologous to such a probe and to each other.

[0102] When used in reference to a single-stranded nucleic acid sequence, the term “substantially homologous” refers to any probe which can hybridize to (i.e., it is the complement of) the single-stranded nucleic acid sequence under conditions of low stringency as described above.

[0103] As used herein, the term “hybridization” is used in reference to the pairing of complementary nucleic acids. Hybridization and the strength of hybridization (i.e., the strength of the association between the nucleic acids) are impacted by such factors as the degree of complementarity between the nucleic acids, stringency of the conditions involved, the T_(m) of the formed hybrid, and the G:C ratio within the nucleic acids.

[0104] As used herein, the term “T_(m)” is used in reference to the “melting temperature.” The melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half dissociated into single strands. The equation for calculating the T_(m) of nucleic acids is well known in the art. As indicated by standard references, a simple estimate of the T_(m) value may be calculated by the equation: T_(m) = 81.5 + 0.41 (percent G+C), when a nucleic acid is in aqueous solution at 1 M NaCl (See e.g., Anderson and Young, Quantitative Filter Hybridization, in Nucleic Acid Hybridization [1985]). Other references include more sophisticated computations which take both structural and sequence characteristics into account for the calculation of T_(m).
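
The simple estimate recited above can be sketched in Python as follows. The function name `estimate_tm` and the example sequence are assumptions made for illustration; the sketch implements only the stated equation (a nucleic acid in aqueous solution at 1 M NaCl) and none of the more sophisticated computations.

```python
def estimate_tm(sequence):
    """Simple T_(m) estimate: 81.5 + 0.41 * (percent G+C).

    Implements only the simple equation for a nucleic acid in aqueous
    solution at 1 M NaCl; it ignores length, mismatches and structure.
    """
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    gc_count = sequence.count("G") + sequence.count("C")
    percent_gc = 100.0 * gc_count / len(sequence)
    return 81.5 + 0.41 * percent_gc


# Hypothetical sequence with 50 percent G+C content.
tm = estimate_tm("ATGCATGCGCTA")
```

Because the equation depends only on base composition, any two sequences with the same percent G+C receive the same estimate, which is one reason the more detailed computations are often preferred for short oligonucleotides.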

[0105] As used herein the term “stringency” is used in reference to the conditions of temperature, ionic strength, and the presence of other compounds such as organic solvents, under which nucleic acid hybridizations are conducted. With “high stringency” conditions, nucleic acid base pairing will occur only between nucleic acid fragments that have a high frequency of complementary base sequences. Thus, conditions of “weak” or “low” stringency are often required with nucleic acids that are derived from organisms that are genetically diverse, as the frequency of complementary sequences is usually less.

[0106] “Amplification” is a special case of nucleic acid replication involving template specificity. It is to be contrasted with non-specific template replication (i.e., replication that is template-dependent but not dependent on a specific template). Template specificity is here distinguished from fidelity of replication (i.e., synthesis of the proper polynucleotide sequence) and nucleotide (ribo- or deoxyribo-) specificity. Template specificity is frequently described in terms of “target” specificity. Target sequences are “targets” in the sense that they are sought to be sorted out from other nucleic acids.

[0107] Template specificity is achieved in most amplification techniques by the choice of enzyme. Amplification enzymes are enzymes that, under the conditions they are used, will process only specific sequences of nucleic acid in a heterogeneous mixture of nucleic acids. For example, in the case of Qβ replicase, MDV-1 RNA is the specific template for the replicase (Kacian et al., Proc. Natl. Acad. Sci. USA 69:3038 [1972]). Other nucleic acids will not be replicated by this amplification enzyme. Similarly, in the case of T7 RNA polymerase, this amplification enzyme has a stringent specificity for its own promoters (Chamberlin et al., Nature 228:227 [1970]). In the case of T4 DNA ligase, the enzyme will not ligate the two oligonucleotides or polynucleotides where there is a mismatch between the oligonucleotide or polynucleotide substrate and the template at the ligation junction (Wu and Wallace, Genomics 4:560 [1989]). Finally, Taq and Pfu polymerases, by virtue of their ability to function at high temperature, are found to display high specificity for the sequences bounded and thus defined by the primers; the high temperature results in thermodynamic conditions that favor primer hybridization with the target sequences and not hybridization with non-target sequences (Erlich (ed.), PCR Technology, Stockton Press [1989]).

[0108] As used herein, the term “amplifiable nucleic acid” is used in reference to nucleic acids which may be amplified by any amplification method. It is contemplated that “amplifiable nucleic acid” will usually comprise “sample template.”

[0109] As used herein, the term “sample template” refers to nucleic acid originating from a sample which is analyzed for the presence of “target” (defined below). In contrast, “background template” is used in reference to nucleic acid other than sample template which may or may not be present in a sample. Background template is most often inadvertent. It may be the result of carryover, or it may be due to the presence of nucleic acid contaminants sought to be purified away from the sample. For example, nucleic acids from organisms other than those to be detected may be present as background in a test sample.

[0110] As used herein, the term “primer” refers to an oligonucleotide, whether occurring naturally as in a purified restriction digest or produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.

[0111] As used herein, the term “probe” refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, which is capable of hybridizing to another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention will be labelled with any “reporter molecule,” so that it is detectable in any detection system, including, but not limited to, enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label. The present invention provides sequences suitable for use as probes.

[0112] As used herein, the term “target,” when used in reference to the polymerase chain reaction, refers to the region of nucleic acid bounded by the primers used for polymerase chain reaction. Thus, the “target” is sought to be sorted out from other nucleic acid sequences. A “segment” is defined as a region of nucleic acid within the target sequence.

[0113] As used herein, the term “polymerase chain reaction” (“PCR”) refers to the method of Mullis (See e.g., U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,965,188, herein incorporated by reference), which describes a method for increasing the concentration of a segment of a target sequence in a mixture of genomic DNA without cloning or purification. This process for amplifying the target sequence consists of introducing a large excess of two oligonucleotide primers to the DNA mixture containing the desired target sequence, followed by a precise sequence of thermal cycling in the presence of a DNA polymerase. The two primers are complementary to their respective strands of the double stranded target sequence. To effect amplification, the mixture is denatured and the primers then annealed to their complementary sequences within the target molecule. Following annealing, the primers are extended with a polymerase so as to form a new pair of complementary strands. The steps of denaturation, primer annealing and polymerase extension can be repeated many times (i.e., denaturation, annealing and extension constitute one “cycle”; there can be numerous “cycles”) to obtain a high concentration of an amplified segment of the desired target sequence. The length of the amplified segment of the desired target sequence is determined by the relative positions of the primers with respect to each other, and therefore, this length is a controllable parameter. By virtue of the repeating aspect of the process, the method is referred to as the “polymerase chain reaction” (hereinafter “PCR”). Because the desired amplified segments of the target sequence become the predominant sequences (in terms of concentration) in the mixture, they are said to be “PCR amplified.”
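
The dependence of the amplified segment on primer position, and the effect of repeated cycles, can be illustrated with the following Python sketch. The template and primer sequences, the function names, and the assumption of perfect doubling in every cycle are hypothetical simplifications introduced for illustration; actual reactions amplify with less than ideal efficiency.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (read 5' to 3')."""
    return sequence.translate(COMPLEMENT)[::-1]


def amplified_segment(template, forward_primer, reverse_primer):
    """Return the segment of `template` bounded by the two primers.

    `template` is one strand read 5' to 3'. The forward primer matches
    that strand directly; the reverse primer anneals to it, so its
    reverse complement is searched for downstream of the forward primer.
    Returns None if either primer site is absent.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]


def ideal_copies(initial_copies, cycles):
    """Copies after `cycles` rounds, assuming perfect doubling each cycle."""
    return initial_copies * 2 ** cycles


# Hypothetical template and primer pair.
template = "TTGACCTGAAGGCTAGCTAGGATCCAATTCGGA"
product = amplified_segment(template, "GACCTG", "TGGATC")
copies = ideal_copies(1, 30)
```

Moving either primer site along the template changes the length of `product`, which reflects the statement above that the length of the amplified segment is a controllable parameter.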

[0114] With PCR, it is possible to amplify a single copy of a specific target sequence in genomic DNA to a level detectable by several different methodologies (e.g., hybridization with a labeled probe; incorporation of biotinylated primers followed by avidin-enzyme conjugate detection; incorporation of ³²P-labeled deoxynucleotide triphosphates, such as dCTP or dATP, into the amplified segment). In addition to genomic DNA, any oligonucleotide or polynucleotide sequence can be amplified with the appropriate set of primer molecules. In particular, the amplified segments created by the PCR process are themselves efficient templates for subsequent PCR amplifications.

[0115] As used herein, the terms “PCR product,” “PCR fragment,” and “amplification product” refer to the resultant mixture of compounds after two or more cycles of the PCR steps of denaturation, annealing and extension are complete. These terms encompass the case where there has been amplification of one or more segments of one or more target sequences.

[0116] As used herein, the term “amplification reagents” refers to those reagents (deoxyribonucleotide triphosphates, buffer, etc.) needed for amplification except for primers, nucleic acid template and the amplification enzyme. Typically, amplification reagents along with other reaction components are placed and contained in a reaction vessel (test tube, microwell, etc.).

[0117] As used herein, the terms “restriction endonucleases” and “restriction enzymes” refer to bacterial enzymes, each of which cuts double-stranded DNA at or near a specific nucleotide sequence.
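
As a simplified illustration of sequence-specific cutting, the following Python sketch digests one strand of a DNA sequence at every occurrence of a recognition sequence. The enzyme EcoRI (recognition sequence GAATTC, cut after the first G) is used only as a familiar example; the function name `digest`, the `cut_offset` parameter, and the input sequence are assumptions made for this sketch, which ignores the complementary strand and the resulting overhangs.

```python
def digest(sequence, recognition_site, cut_offset):
    """Cut `sequence` at every occurrence of `recognition_site`.

    `cut_offset` is the number of bases into the site at which the
    strand is cut. Only one strand is modelled in this sketch.
    """
    fragments = []
    previous_cut = 0
    position = sequence.find(recognition_site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[previous_cut:cut])
        previous_cut = cut
        # Continue searching after the start of the current site.
        position = sequence.find(recognition_site, position + 1)
    fragments.append(sequence[previous_cut:])
    return fragments


# Hypothetical sequence containing two EcoRI sites (GAATTC).
fragments = digest("AAGAATTCTTTGAATTCGG", "GAATTC", 1)
```

Joining the fragments in order reproduces the input sequence, so the sketch can also be used to check fragment sizes against those expected on a gel.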

[0118] As used herein, the term “antisense” is used in reference to RNA sequences which are complementary to a specific cDNA or RNA sequence (e.g., mRNA). Included within this definition are antisense complementary RNA (cRNA) molecules produced by an in vitro transcription method from a cDNA template. The term “antisense strand” is used in reference to a nucleic acid strand that is complementary to the “sense” strand. The designation (−) (i.e., “negative”) is sometimes used in reference to the antisense strand, with the designation (+) sometimes used in reference to the sense (i.e., “positive”) strand.

[0119] As used herein, the term “polyA⁺ RNA” refers to RNA molecules having a stretch of adenine nucleotides at the 3′ end. This polyadenine stretch is also referred to as a “poly-A tail.” Eukaryotic mRNA molecules contain poly-A tails and are referred to as polyA⁺ RNA.

[0120] The terms “in operable combination,” “in operable order,” and “operably linked” as used herein refer to the linkage of nucleic acid sequences in such a manner that a nucleic acid molecule capable of directing the transcription of a given gene and/or the synthesis of a desired protein molecule is produced. The term also refers to the linkage of amino acid sequences in such a manner so that a functional protein is produced.

[0121] The term “isolated” when used in relation to a nucleic acid, as in “an isolated oligonucleotide” or “isolated polynucleotide,” refers to a nucleic acid sequence that is identified and separated from at least one contaminant nucleic acid with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. In contrast, non-isolated nucleic acids are nucleic acids such as DNA and RNA found in the state they exist in nature. For example, a given DNA sequence (e.g., a gene) is found on the host cell chromosome in proximity to neighboring genes; RNA sequences, such as a specific mRNA sequence encoding a specific protein, are found in the cell as a mixture with numerous other mRNAs which encode a multitude of proteins. However, isolated nucleic acids encoding a protein include, by way of example, such nucleic acids in cells ordinarily expressing the protein where the nucleic acid is in a chromosomal location different from that of natural cells, or is otherwise flanked by a different nucleic acid sequence than that found in nature. The isolated nucleic acid, oligonucleotide, or polynucleotide may be present in single-stranded or double-stranded form. When an isolated nucleic acid, oligonucleotide or polynucleotide is to be utilized to express a protein, the oligonucleotide or polynucleotide will contain at a minimum the sense or coding strand (i.e., the oligonucleotide or polynucleotide may be single-stranded), but may contain both the sense and anti-sense strands (i.e., the oligonucleotide or polynucleotide may be double-stranded).

[0122] As used herein, a “portion of a chromosome” refers to a discrete section of the chromosome. Chromosomes are divided into sites or sections by cytogeneticists as follows: the short (relative to the centromere) arm of a chromosome is termed the “p” arm; the long arm is termed the “q” arm. Each arm is then divided into 2 regions termed region 1 and region 2 (region 1 is closest to the centromere). Each region is further divided into bands. The bands may be further divided into sub-bands. For example, the 11p15.5 portion of human chromosome 11 is the portion located on chromosome 11 (11) on the short arm (p) in the first region (1) in the 5th band (5) in sub-band 5 (.5). A portion of a chromosome may be “altered;” for instance the entire portion may be absent due to a deletion or may be rearranged (e.g., inversions, translocations, expanded or contracted due to changes in repeat regions). In the case of a deletion, an attempt to hybridize (i.e., specifically bind) a probe homologous to a particular portion of a chromosome could result in a negative result (i.e., the probe could not bind to the sample containing genetic material suspected of containing the missing portion of the chromosome). Thus, hybridization of a probe homologous to a particular portion of a chromosome may be used to detect alterations in a portion of a chromosome.
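
The nomenclature described above can be made concrete with a short Python sketch that splits a designation such as 11p15.5 into its parts. The pattern, the function name `parse_band`, and the field names are assumptions chosen for this illustration; the sketch handles only the single-digit region and band form described in this paragraph.

```python
import re

# chromosome, arm (p or q), region digit, band digit, optional sub-band
BAND_PATTERN = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_band(designation):
    """Split a designation such as '11p15.5' into its components."""
    match = BAND_PATTERN.match(designation)
    if match is None:
        raise ValueError(f"unrecognized designation: {designation!r}")
    chromosome, arm, region, band, sub_band = match.groups()
    return {
        "chromosome": chromosome,
        "arm": arm,          # 'p' is the short arm, 'q' the long arm
        "region": region,
        "band": band,
        "sub_band": sub_band,  # None when no sub-band is given
    }


parsed = parse_band("11p15.5")
```

For the example in the text, the sketch yields chromosome 11, arm p, region 1, band 5 and sub-band 5.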

[0123] The term “sequences associated with a chromosome” means preparations of chromosomes (e.g., spreads of metaphase chromosomes), nucleic acid extracted from a sample containing chromosomal DNA (e.g., preparations of genomic DNA); the RNA which is produced by transcription of genes located on a chromosome (e.g., hnRNA and mRNA) and cDNA copies of the RNA transcribed from the DNA located on a chromosome. Sequences associated with a chromosome may be detected by numerous techniques including probing of Southern and Northern blots and in situ hybridization to RNA, DNA or metaphase chromosomes with probes containing sequences homologous to the nucleic acids in the above listed preparations.

[0124] As used herein the term “coding region” when used in reference to a structural gene refers to the nucleotide sequences which encode the amino acids found in the nascent polypeptide as a result of translation of a mRNA molecule. The coding region is bounded, in eukaryotes, on the 5′ side by the nucleotide triplet “ATG” which encodes the initiator methionine and on the 3′ side by one of the three triplets which specify stop codons (i.e., TAA, TAG, TGA).
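
A minimal Python sketch of locating a coding region by its bounding triplets is given below. It assumes that the first “ATG” in the sequence is the initiator codon and returns the sequence from that triplet through the first in-frame stop triplet, inclusive; the function name `find_coding_region` and the example sequence are hypothetical, and real genes may require consideration of sequence context and intervening sequences.

```python
STOP_TRIPLETS = {"TAA", "TAG", "TGA"}


def find_coding_region(dna):
    """Return the region from the first ATG through the first in-frame stop.

    Assumes the first ATG is the initiator codon (a simplification).
    Returns None if no ATG or no in-frame stop triplet is found.
    """
    start = dna.find("ATG")
    if start == -1:
        return None
    # Step through the sequence one triplet at a time, in frame with ATG.
    for index in range(start, len(dna) - 2, 3):
        if dna[index:index + 3] in STOP_TRIPLETS:
            return dna[start:index + 3]
    return None


# Hypothetical sequence with short flanking stretches on either side.
region = find_coding_region("CCATGGCTTGGTAAGC")
```

In this sketch the returned region includes both bounding triplets; excluding the stop triplet would be an equally reasonable convention.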

[0125] As used herein, the term “structural gene” refers to a DNA sequence coding for RNA or a protein. In contrast, “regulatory genes” are structural genes which encode products which control the expression of other genes (e.g., transcription factors).

[0126] As used herein, the term “purified” or “to purify” refers to the removal of contaminants from a sample. For example, antibodies are purified by removal of contaminating non-immunoglobulin proteins; they are also purified by the removal of immunoglobulin that does not bind the antigen of interest. The removal of non-immunoglobulin proteins and/or the removal of immunoglobulins that do not bind the antigen of interest results in an increase in the percent of immunoglobulins in the sample that bind the antigen of interest. In another example, recombinant polypeptides are expressed in bacterial host cells and the polypeptides are purified by the removal of host cell proteins; the percent of recombinant polypeptides is thereby increased in the sample.

[0127] The term “recombinant DNA molecule” as used herein refers to a DNA molecule which is comprised of segments of DNA joined together by means of molecular biological techniques.

[0128] The term “recombinant protein” or “recombinant polypeptide” as used herein refers to a protein molecule which is expressed from a recombinant DNA molecule.

[0129] The term “native protein” is used herein to indicate that a protein does not contain amino acid residues encoded by vector sequences; that is, the native protein contains only those amino acids found in the protein as it occurs in nature. A native protein may be produced by recombinant means or may be isolated from a naturally occurring source.

[0130] As used herein the term “portion” when in reference to a protein (as in “a portion of a given protein”) refers to fragments of that protein. The fragments may range in size from four amino acid residues to the entire amino acid sequence minus one amino acid.

[0131] As used herein, the term “fusion protein” refers to a chimeric protein containing the protein of interest (or fragments thereof) joined to an exogenous protein fragment. The fusion partner may enhance solubility of the protein of interest as expressed in a host cell, may provide an affinity tag to allow purification of the recombinant fusion protein from the host cell or culture supernatant, or both. If desired, the fusion partner may be removed from the protein of interest by a variety of enzymatic or chemical means known to the art.

[0132] The term “Southern blot” refers to the analysis of DNA on agarose or acrylamide gels to fractionate the DNA according to size followed by transfer of the DNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized DNA is then probed with a labeled probe to detect DNA species complementary to the probe used. The DNA may be cleaved with restriction enzymes prior to electrophoresis. Following electrophoresis, the DNA may be partially depurinated and denatured prior to or during transfer to the solid support. Southern blots are a standard tool of molecular biologists (J. Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, NY, pp 9.31-9.58 [1989]).

[0133] The term “Northern blot,” as used herein, refers to the analysis of RNA by electrophoresis of RNA on agarose gels to fractionate the RNA according to size followed by transfer of the RNA from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized RNA is then probed with a labeled probe to detect RNA species complementary to the probe used. Northern blots are a standard tool of molecular biologists (Sambrook et al., supra, pp 7.39-7.52 [1989]).

[0134] The term “Western blot” refers to the analysis of protein(s) (or polypeptides) immobilized onto a support such as nitrocellulose or a membrane. The proteins are run on acrylamide gels to separate the proteins, followed by transfer of the protein from the gel to a solid support, such as nitrocellulose or a nylon membrane. The immobilized proteins are then exposed to antibodies with reactivity against an antigen of interest. The binding of the antibodies may be detected by various methods, including the use of radiolabelled antibodies.

[0135] The term “antigenic determinant” as used herein refers to that portion of an antigen that makes contact with a particular antibody (i.e., an epitope). When a protein or fragment of a protein is used to immunize a host animal, numerous regions of the protein may induce the production of antibodies which bind specifically to a given region or three-dimensional structure on the protein; these regions or structures are referred to as antigenic determinants. An antigenic determinant may compete with the intact antigen (i.e., the “immunogen” used to elicit the immune response) for binding to an antibody.

[0136] The terms “specific binding” and “specifically binding” when used in reference to the interaction of an antibody and a protein or peptide mean that the interaction is dependent upon the presence of a particular structure (i.e., the antigenic determinant or epitope) on the protein; in other words, the antibody is recognizing and binding to a specific protein structure rather than to proteins in general. For example, if an antibody is specific for epitope “A,” the presence of a protein containing epitope A (or free, unlabelled A) in a reaction containing labelled “A” and the antibody will reduce the amount of labelled A bound to the antibody.

[0137] The present invention also contemplates “non-human animals” comprising any non-human animal capable of overexpressing mRNA and/or proteins of interest. Such non-human animals include vertebrates such as rodents, non-human primates, ovines, bovines, ruminants, lagomorphs, porcines, caprines, equines, canines, felines, aves, etc. Preferred non-human animals are selected from the order Rodentia, most preferably mice. The term “order Rodentia” refers to rodents (i.e., placental mammals [Class Eutheria] which include the family Muridae [rats and mice]).

[0138] The “non-human animals having a genetically engineered genotype” of the invention are preferably produced by experimental manipulation of the genome of the germline of the non-human animal. These genetically engineered non-human animals may be produced by several methods including the introduction of a “transgene” comprising nucleic acid (usually DNA) into an embryonal target cell or integration into a chromosome of the somatic and/or germ line cells of a non-human animal by way of human intervention, such as by the methods described herein. Non-human animals which contain a transgene are referred to as “transgenic non-human animals.” A transgenic animal is an animal whose genome has been altered by the introduction of a transgene.

[0139] The term “transgene” as used herein refers to a foreign gene that is placed into an organism by introducing the foreign gene into newly fertilized eggs or early embryos. The term “foreign gene” refers to any nucleic acid (e.g., gene sequence) which is introduced into the genome of an animal by experimental manipulations and may include gene sequences found in that animal so long as the introduced gene does not reside in the same location as does the naturally-occurring gene.

[0140] As used herein, the term “vector” is used in reference to nucleic acid molecules that transfer DNA segment(s) from one cell to another. The term “vehicle” is sometimes used interchangeably with “vector.”

[0141] The term “expression vector” as used herein refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include a promoter, an operator (optional), and a ribosome binding site, often along with other sequences. Eukaryotic cells are known to utilize promoters, enhancers, and termination and polyadenylation signals.

[0142] The terms “overexpression” and “overexpressing” and grammatical equivalents are used in reference to levels of mRNA to indicate a level of expression approximately 3-fold higher than that typically observed in a given tissue in a control or non-transgenic animal. Levels of mRNA are measured using any of a number of techniques known to those skilled in the art including, but not limited to, Northern blot analysis. Appropriate controls are included on the Northern blot to control for differences in the amount of RNA loaded from each tissue analyzed (e.g., the amount of 28S rRNA, an abundant RNA transcript present at essentially the same amount in all tissues, present in each sample can be used as a means of normalizing or standardizing the protein of interest mRNA-specific signal observed on Northern blots). The amount of mRNA present in the band corresponding in size to the correctly spliced protein transgene RNA is quantified; other minor species of RNA which hybridize to the transgene probe are not considered in the quantification of the expression of the transgenic mRNA. The term “transfection” as used herein refers to the introduction of foreign DNA into eukaryotic cells. Transfection may be accomplished by a variety of means known to the art including calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated transfection, polybrene-mediated transfection, electroporation, microinjection, liposome fusion, lipofection, protoplast fusion, retroviral infection, and biolistics.

[0143] The term “stable transfection” or “stably transfected” refers to the introduction and integration of foreign DNA into the genome of the transfected cell. The term “stable transfectant” refers to a cell which has stably integrated foreign DNA into the genomic DNA.

[0144] The term “transient transfection” or “transiently transfected” refers to the introduction of foreign DNA into a cell where the foreign DNA fails to integrate into the genome of the transfected cell. The foreign DNA persists in the nucleus of the transfected cell for several days. During this time the foreign DNA is subject to the regulatory controls that govern the expression of endogenous genes in the chromosomes. The term “transient transfectant” refers to cells which have taken up foreign DNA but have failed to integrate this DNA.

[0145] As used herein, the term “selectable marker” refers to the use of a gene which encodes an enzymatic activity that confers the ability to grow in medium lacking what would otherwise be an essential nutrient (e.g., the HIS3 gene in yeast cells); in addition, a selectable marker may confer resistance to an antibiotic or drug upon the cell in which the selectable marker is expressed. Selectable markers may be “dominant”; a dominant selectable marker encodes an enzymatic activity which can be detected in any eukaryotic cell line. Examples of dominant selectable markers include the bacterial aminoglycoside 3′ phosphotransferase gene (also referred to as the neo gene) which confers resistance to the drug G418 in mammalian cells, the bacterial hygromycin B phosphotransferase (hyg) gene which confers resistance to the antibiotic hygromycin, and the bacterial xanthine-guanine phosphoribosyl transferase gene (also referred to as the gpt gene) which confers the ability to grow in the presence of mycophenolic acid. Other selectable markers are not dominant in that their use must be in conjunction with a cell line that lacks the relevant enzyme activity. Examples of non-dominant selectable markers include the thymidine kinase (tk) gene which is used in conjunction with tk⁻ cell lines, the CAD gene which is used in conjunction with CAD-deficient cells, and the mammalian hypoxanthine-guanine phosphoribosyl transferase (hprt) gene which is used in conjunction with hprt⁻ cell lines. A review of the use of selectable markers in mammalian cell lines is provided in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York (1989) pp. 16.9-16.15.

[0146] As used herein, the term “cell culture” refers to any in vitro culture of cells. Included within this term are continuous cell lines (e.g., with an immortal phenotype), primary cell cultures, finite cell lines (e.g., non-transformed cells), and any other cell population maintained in vitro.

[0147] The terms “compound” and “test compound” refer to any chemical entity, pharmaceutical, drug, and the like that can be used to treat or prevent a disease, illness, sickness, or disorder of bodily function. Compounds comprise both known and potential therapeutic compounds. A compound can be determined to be therapeutic by screening using the screening methods of the present invention. A “known therapeutic compound” refers to a therapeutic compound that has been shown (e.g., through animal trials or prior experience with administration to humans) to be effective in such treatment. In other words, a known therapeutic compound is not limited to a compound efficacious in the treatment of cancer.

[0148] A “composition comprising a given polynucleotide sequence” as used herein refers broadly to any composition containing the given polynucleotide sequence. The composition may comprise an aqueous solution.

[0149] The term “sample” as used herein is used in its broadest sense. A sample suspected of containing a human chromosome or sequences associated with a human chromosome may comprise a cell, chromosomes isolated from a cell (e.g., a spread of metaphase chromosomes), genomic DNA (in solution or bound to a solid support such as for Southern blot analysis), RNA (in solution or bound to a solid support such as for Northern blot analysis), cDNA (in solution or bound to a solid support) and the like. A sample suspected of containing a protein may comprise a cell, a portion of a tissue, an extract containing one or more proteins and the like.

EXPERIMENTAL

[0150] The following examples are provided in order to demonstrate and further illustrate certain preferred embodiments and aspects of the present invention and are not to be construed as limiting the scope thereof. In the experimental disclosure which follows, the following scientific abbreviations/notations apply: lod (log of odds); PFC (prefrontal cortex); AMY (amygdala); SNP (single nucleotide polymorphism); ° C. (degrees Centigrade); rpm (revolutions per minute); BSA (bovine serum albumin); CFA (complete Freund's adjuvant); IFA (incomplete Freund's adjuvant); IgG (immunoglobulin G); IM (intramuscular); IP (intraperitoneal); IV (intravenous or intravascular); SC (subcutaneous); H₂O (water); HCl (hydrochloric acid); aa (amino acid); bp (base pair); kb (kilobase pair); kD (kilodaltons); cM (centimorgans); gm or g (grams); μg (micrograms); mg (milligrams); ng (nanograms); μl (microliters); ml (milliliters); mm (millimeters); nm (nanometers); μm (micrometers); M (molar); mM (millimolar); μM (micromolar); U (units); V (volts); MW (molecular weight); sec (seconds); min(s) (minute/minutes); hr(s) (hour/hours); MgCl₂ (magnesium chloride); NaCl (sodium chloride); OD₂₈₀ (optical density at 280 nm); OD₆₀₀ (optical density at 600 nm); PAGE (polyacrylamide gel electrophoresis); PBS (phosphate buffered saline [150 mM NaCl, 10 mM sodium phosphate buffer, pH 7.2]); PCR (polymerase chain reaction); PEG (polyethylene glycol); PMSF (phenylmethylsulfonyl fluoride); SDS (sodium dodecyl sulfate); w/v (weight to volume); and v/v (volume to volume).

[0151] As used herein, the following abbreviations also apply: ABI (Applied Biosystems, Foster City, Calif.); Affymetrix (Affymetrix, Santa Clara, Calif.); Santa Cruz (Santa Cruz Biologicals, Santa Cruz, Calif.); Amersham (Amersham Pharmacia Biotech, Piscataway, N.J.); Amicon (Amicon, Inc., Beverly, Mass.); ATCC (American Type Culture Collection, Rockville, Md.); BioRad (BioRad, Richmond, Calif.); Clontech (CLONTECH Laboratories, Palo Alto, Calif.); GIBCO BRL or Gibco BRL (Life Technologies, Inc., Gaithersburg, Md.); Hewlett-Packard (Hewlett-Packard Company, Palo Alto, Calif.); Invitrogen (Invitrogen-Novex, San Diego, Calif.); Molecular Dynamics (Molecular Dynamics, Sunnyvale, Calif.); New England Biolabs (New England Biolabs, Inc., Beverly, Mass.); Novagen (Novagen, Inc., Madison, Wis.); Perkin Elmer (PE Biosystems, Foster City, Calif.); Promega (Promega Corp, Madison, Wis.); Sigma (Sigma Chemical Co., St. Louis, Mo.); Stratagene (Stratagene Cloning Systems, La Jolla, Calif.); Sun (Sun Microsystems Inc., Palo Alto, Calif.); and Weizmann Institute (Weizmann Institute of Science, Rehovot, Israel).

Example 1 Amphetamine Treatment

[0152] In these experiments, a rat animal model was used to identify susceptibility genes. These experiments were done twice, independently, with different sets of animals and at different times, to assess reproducibility.

[0153] Three Sprague Dawley rats were treated with 4 mg/kg amphetamine, while another three rats were treated with normal saline injection (i.e., negative-control animals). After 24 hours, the rats were humanely sacrificed and the brains were harvested.

Example 2 Tissue Testing and Analysis

[0154] In these experiments, the brain tissues obtained from the rats described in Example 1 were processed and tested. Samples were handled according to the recommendation of Affymetrix, the manufacturer of the GeneChips used during the development of the present invention. In the experiments described in greater detail below, the Affymetrix U34A chip, which measures 7,000 cDNAs and 1,000 ESTs, was used. The analyses were conducted at the University of California, San Diego/Veteran's Administration Center GeneChip Core Facility.

[0155] Tissues from each brain region from the three animals in each experimental group were pooled (i.e., test and control animals). Total RNA was isolated from the tissue using standard protocols known in the art. Briefly, STAT-60 extraction buffer and phenol/chloroform extraction were used. cDNA was synthesized and used as a template to produce biotin-labeled antisense cRNAs using an in vitro transcription reaction. After fragmentation, the cRNA hybridization cocktail was prepared, cleaned, and applied to the Affymetrix GeneChip oligonucleotide array. The loaded GeneChip was incubated overnight in a GeneChip hybridization oven. Immediately following hybridization, the probe array was washed and then stained with a streptavidin-phycoerythrin (SAPE) fluorescence tag. The GeneChip Fluidics Station was used to automate the washing steps to remove non-specifically bound cRNA and stain.

[0156] Once the probe array was hybridized, stained and washed, it was scanned using a Hewlett-Packard GeneArray scanner. The GeneChip Operating System, running on a PC workstation, controlled the scanner functions and collected fluorescence intensity data. Data were processed using GeneChip expression analysis software from Affymetrix. A two-fold increase or decrease in expression was chosen as a conventional empirical cut-off. Thus, at least a two-fold change in each of two independent animal experiments was used to select those genes with the most robust and reproducible change in expression. In these analyses, standard default settings of the Affymetrix GeneChip Expression Algorithm were used. A gene had to be called “Present” and “Changed” in at least one out of two experiments and had to have an Average Difference Change greater than 50, as well as a fold change greater than 2 in two out of two experiments. Genes meeting these criteria are summarized in Table 1 for the prefrontal cortex (PFC) and Table 2 for the amygdala (AMY). The genes that were induced more than two-fold in both experiments were also identified by their GenBank accession numbers, as indicated in the Tables. A gene was scored as mapping to a linkage region for either schizophrenia (S) or bipolar disorder (B) if its human homologue mapped to within 10 cM of a marker for which at least suggestive evidence of linkage had been reported.
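The selection rule in the preceding paragraph can be sketched as a small filter. This is only an illustration under stated assumptions, not the Affymetrix algorithm: the `ExperimentCall` record and `passes_filter` function are hypothetical names, the thresholds on Average Difference Change and fold change are both assumed to apply in each of the two experiments, decreases are assumed to be reported as negative values, and the requirement that both experiments change in the same direction is an added assumption.

```python
from dataclasses import dataclass


@dataclass
class ExperimentCall:
    """Per-gene result from one experiment (hypothetical record layout)."""
    present: bool           # detection call was "Present"
    changed: bool           # difference call was "Changed"
    avg_diff_change: float  # Average Difference Change (negative for a decrease)
    fold_change: float      # signed fold change (negative for a decrease)


def passes_filter(exp1, exp2, min_avg_diff=50.0, min_fold=2.0):
    """Return True if a gene meets the two-experiment selection criteria."""
    experiments = (exp1, exp2)
    # "Present" and "Changed" in at least one out of two experiments.
    called = any(e.present and e.changed for e in experiments)
    # Magnitude thresholds in two out of two experiments (assumed reading).
    robust = all(
        abs(e.avg_diff_change) > min_avg_diff and abs(e.fold_change) > min_fold
        for e in experiments
    )
    # Assumption: an increase in one experiment and a decrease in the other
    # is not treated as a reproducible change.
    same_direction = (exp1.fold_change > 0) == (exp2.fold_change > 0)
    return called and robust and same_direction
```

With this sketch, a gene whose fold changes are 1.1 and 1.0 in the two experiments, as reported for GRK2, would be rejected by the fold-change threshold alone.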

[0157] The chromosomal locations of the human homologues of these genes were then compared with published linkage reports for bipolar disorder and schizophrenia, as well as data generated during the development of the present invention, to cross-validate the results and identify high-probability candidate genes. The human homologues and human chromosomal map locations were determined using the NCBI database. GeneCard (Weizmann Institute), a comprehensive database containing all of the various information available regarding known genes and their functions, was also used for each gene identified in the screen. Genes were considered to be positional candidates (i.e., close to a genomic hotspot) if they mapped to within 10 cM of a marker for which there was at least one report of suggestive evidence of linkage (Lander and Kruglyak, Nat. Genet., 11:241 [1995]). The Marshfield integrated linkage map was used as a reference for genetic location. As shown in Tables 1 and 2, eight of these genes met the criteria used in the analyses described in the Examples herein. It was also noted that a number of interesting genes were very narrowly positioned below this threshold. An indication of the specificity of the result is that GRK2, a close homologue of GRK3, demonstrated no change in expression in either experiment (fold changes of 1.1 and 1.0 in two experiments).
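The positional-candidate test described above reduces to a distance check on a common genetic map. The following is a minimal sketch, assuming that gene and marker positions have already been placed on the same map in cM; the function name and the tuple layout are hypothetical, and the positions used with it would come from the reference map rather than from this sketch.

```python
def is_positional_candidate(gene_chrom, gene_cm, linked_markers, window_cm=10.0):
    """Return True if the gene lies within window_cm of any linked marker.

    linked_markers is an iterable of (chromosome, position_cM) pairs for
    markers with at least suggestive evidence of linkage (hypothetical layout).
    """
    return any(
        chrom == gene_chrom and abs(position - gene_cm) <= window_cm
        for chrom, position in linked_markers
    )
```

Distances are compared only between loci on the same chromosome, since a cM offset between different chromosomes has no meaning.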

Example 3 Mutation Screening of GRK3

[0158] In these experiments, portions of the GRK3 gene locus were amplified and directly sequenced from 14 bipolar patients and 6 control subjects.

[0159] The GRK3 gene spans 21 exons over 170 kb. Using the available genomic sequence, PCR primers were designed so as to individually amplify each of the 21 exons including approximately 200 bp of flanking intronic sequence which contains splicing signals. Primers were also designed to amplify approximately 1.6 kb in the 5′ promoter region in four overlapping segments. In order to enrich the sample to be screened for those subjects most likely to contain a functional mutation in the GRK3 gene, families were identified from the 20 families that were part of an earlier genome scan and the 57 NIMH families which showed a positive lod score for the marker D22S419. Fourteen such bipolar subjects were identified and their DNA was PCR amplified for each of these regions. In addition to these 14 affected subjects from families with evidence of linkage, another set of 6 control subjects was screened in order to identify high frequency anonymous sequence variants in introns suitable as markers for linkage disequilibrium studies. These double stranded PCR fragments were then sequenced directly using cycle sequencing and fluorescent detection. Sequencing reactions were electrophoretically separated and detected using an ABI 377 automated DNA sequencer. The resulting electropherograms were analyzed for single nucleotide polymorphisms (SNPs) as both homozygotes and heterozygotes using the software package PolyPhred (Nickerson et al., Nucleic Acids Res 25:2745-2751 [1997]).

[0160] The results of these experiments are summarized in FIGS. 1 and 2. No coding sequence SNPs were detected. Nor were any SNPs detected in probable splice signals. However, six SNPs were detected in the probable promoter of the gene. Two of these SNPs occur within 400 bp of the translation start site, while the others occur within approximately 0.9 kb, 1.2 kb and 1.3 kb of the translation start site. As a first approach to examining the possible functional impact of these SNPs, the 1.6 kb of sequence 5′ to the ATG translation start site was compared to the TRANSFAC database using the NSITE program available at the Sanger Centre web site (http://genomic.sanger.ac.uk/gf/gf.shtml). Three of the four SNPs occur in potential transcription factor binding sites. The most 3′ of the sites (515b) lies at the base of a palindrome predicted to form an mRNA hairpin with a 14 bp stem. 5′ UTR hairpins have been shown to function as translational regulatory elements. Although this analysis is speculative, it is consistent with this region being the promoter and with a possible effect on transcription by these SNPs.

Example 4 Linkage Disequilibrium Studies of GRK3

[0161] Sample 1 United States Triads

[0162] Four of the SNPs (i.e., 514a, 514b, 515a and 515b) identified in the GRK3 promoter region of the bipolar patients of Example 3 were examined for genetic association to bipolar disorder. In addition, four high frequency anonymous SNPs identified from the control subjects of Example 3 were also examined. Two of the latter SNPs are located 28 kb 5′ to the GRK3 translation start site and two are located 110 and 150 kb 3′ to the start of translation.

[0163] The seven single bp substitution SNPs were genotyped by the TaqMan allele specific assay method (Perkin Elmer) according to the manufacturer's protocols. For each site, primer pairs flanking the site to be interrogated were selected for PCR amplification of fragments of less than 150 bp. Two dual labeled probes centered on the SNP and differing in sequence by the one bp polymorphism of the SNP site itself were designed. The probes were labeled with 5′ reporter fluors FAM or TET and 3′ quencher TAMRA. Sensitivity and specificity for allelic discrimination were tested over a wide range of primer and probe concentrations on the DNA samples whose allele type was previously determined. Concentrations and cycling parameters were chosen for genotyping that produced clustered values for heterozygotes which separated from homozygotes by greater than 4 standard deviations. Any samples which gave ambiguous calls were retyped. Accuracy of typing was checked by retyping 450 samples; no incorrect calls were detected. TaqMan reagents could not be developed for the 5′-UTR deletion variant (located at −130 bp). Instead, this variant was typed by standard size-based methods commonly used for microsatellite genotyping. A FAM-labeled forward and unlabelled reverse primer pair were used to amplify a 228 bp genomic fragment spanning the variant. The one bp deletion was detected by size discrimination on a sequencing gel. All genotypes were read in a machine assisted fashion using ABI software and confirmed by two independent human readers.
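The cluster-separation criterion used to choose assay conditions can be illustrated as follows. This is a sketch under assumptions, since the text does not say how the standard deviation was defined: here the gap between cluster means is divided by the larger of the two within-cluster sample standard deviations, and the input values stand for whatever per-sample signal (for example, a reporter fluorescence ratio) the assay produces. Both function names are hypothetical.

```python
from statistics import mean, stdev


def separation_in_sd(het_values, hom_values):
    """Gap between cluster means, in units of the larger cluster SD."""
    gap = abs(mean(het_values) - mean(hom_values))
    # Assumption: the wider of the two clusters sets the scale.
    # Each cluster needs at least two values with some spread.
    spread = max(stdev(het_values), stdev(hom_values))
    return gap / spread


def clusters_resolved(het_values, hom_values, min_sd=4.0):
    """True when heterozygotes separate from homozygotes by more than min_sd."""
    return separation_in_sd(het_values, hom_values) > min_sd
```

Conditions under which `clusters_resolved` returns False for known-genotype control samples would, on this reading, be rejected in favor of other primer and probe concentrations.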

[0164] Each SNP was genotyped in a set of 120 Caucasian pedigrees; 62 of these pedigrees consisted of parent and offspring trios and 58 pedigrees consisted of 2 or more siblings plus parents. In both types of families, the affected offspring were diagnosed with either bipolar I or bipolar II disorder. Therefore, there were a total of 181 triad families extracted from the 120 pedigrees. Allele frequencies for each of the markers in this set of pedigrees are listed in Table 3. Transmission disequilibrium tests were carried out using the program TDTLIKE (Terwilliger, Am J Hum Genet 56:777-787 [1995]). Using this program, transmitted and untransmitted alleles are counted from each heterozygote parent to an affected offspring. This method only counts transmissions where both parents have genotype information. Using TDTLIKE, a McNemar chi-square test statistic and associated one-sided p-values were computed. Two SNPs had nominal p-values less than 0.05. As shown in Table 3, allele “1” for marker 514a had 18 transmitted versus 5 untransmitted alleles (chi-square=7.35, p-value=0.007). In addition, allele “1” in marker 515a had 13 transmitted versus 4 untransmitted alleles (chi-square=4.8, p-value=0.03). P-values were also empirically computed using 10,000 replications as carried out using the program GASSOC v. 1.05 (Schaid, Genet Epidemiol 16:250-260 [1999]). The empirical p-values were similar to those derived from the chi-square statistic (p=0.009 for 514a and p=0.04 for 515a, respectively). With both markers, the associated allele had a frequency of less than 5 percent in this population. These results show evidence for excess transmission in these two SNPs in the promoter region of GRK3 in this pedigree set. It must also be noted that the inclusion of multiple sibs in some of these families may make this in part a test of linkage, as well as linkage disequilibrium. Only six haplotypes for these four SNPs were observed (See, Table 4), indicating a high degree of linkage disequilibrium. As they are all in tight linkage disequilibrium, it is not possible to determine which of the three is most likely to be functionally relevant, or to exclude the possibility that the functional SNP is some other nearby variant not yet identified. However, analyses of the other SNPs approximately 40 kb upstream or 110 kb and 150 kb downstream were uniformly negative, thereby bracketing the region for association to the vicinity of the promoter.

TABLE 3
TDT Analysis of GRK3 SNPs in Sample 1

SNP     Location¹   Allele²   Frequency   T³   N³   χ²     p-value
A486a   −28 kb      1         0.22        63   60   0.07   n.s.⁴
A486b   −28 kb      2         0.72        81   71   0.66   n.s.
A514a   −1330 bp    1         0.04        18   5    7.35   0.007
A514b   −1306 bp    1         0.99        5    4    0.11   n.s.
A515a   −383 bp     1         0.03        13   4    4.76   0.03
A515b   −110 bp     1         0.02        9    6    0.6    n.s.
A630    110 kb      2         0.37        80   75   0.16   n.s.
A665    150 kb      2         0.45        85   82   0.05   n.s.
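The χ² values in Table 3 follow from the transmitted (T) and untransmitted (N) counts through the McNemar form of the statistic, χ² = (T − N)² / (T + N), with one degree of freedom. The sketch below is not TDTLIKE itself; it is a minimal recomputation using only the Python standard library, with hypothetical function names, and it uses the standard identity that the upper-tail probability of a 1-df chi-square equals erfc(√(χ²/2)).

```python
import math


def tdt_chi_square(transmitted, untransmitted):
    """McNemar-type TDT statistic from heterozygous-parent transmissions."""
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative transmissions")
    return (transmitted - untransmitted) ** 2 / total


def chi_square_p_value_1df(chi_square):
    """Upper-tail probability of a chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


# Counts for markers 514a and 515a as given in Table 3.
for marker, t, n in (("514a", 18, 5), ("515a", 13, 4)):
    chi2 = tdt_chi_square(t, n)
    print(marker, round(chi2, 2), round(chi_square_p_value_1df(chi2), 3))
```

Applied to the remaining rows of Table 3, the same function gives statistics well below the 3.84 threshold for nominal significance at 1 df, in line with the n.s. entries.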

[0165] TABLE 4 GRK3 Promoter Haplotypes in Sample 1 514a 514b 515a 515b#¹ T² N² + 5 4 1 + + 2 2 0 + + + 9 6 2 + + 1 1 0 + 1  1⁴  1⁴ + 8  5⁴  5⁴

[0166] Sample 2—Canadian Triads

[0167] As described above for the triads in Sample 1, two of the SNPs identified in the GRK3 promoter region of the bipolar patients of Example 3 were examined for genetic association to bipolar disorder in a second sample of 248 triads. The SNPs genotyped were SNPs 514a and 515a, which are approximately 1300 and 300 bp upstream from the ATG, and about 1100 and 100 bp upstream from the approximate transcription start site, respectively. These SNPs yielded evidence of association to bipolar disorder in the first sample of 150 triads. SNPs 514a and 515a were genotyped using the TaqMan method and analyzed for association using the TDT. The results are summarized in Table 5.

TABLE 5
TDT Analysis of GRK3 SNPs in Sample 2

              Caucasian N. European (210)   Non N. European (33)   Non Caucasian (5)
SNP    ν¹     T    N    χ²    p value       T    N                 T    N
514a   0.08   18   12   1.2   n.s.          2    4                 0    2
515a   0.04   10   3    3.8   0.05          2    3                 1    2

Haplotype
514a   515a   T    N                        T    N                 T    N
+      −      9    9                        1    1                 0    1
+      +      9    3                        1    3                 0    1
−      +      1    0                        1    0                 1    1

[0168] These results are consistent with those observed in Sample 1, which included both University of California, San Diego and National Institutes of Mental Health families.

[0169] SNP 515a demonstrated an approximately three-fold greater rate of transmission compared to nontransmission (10:3). This resulted in a χ² of 3.8 and a nominal p value of 0.05. In contrast to Sample 1, where SNP 514a gave the strongest results, it was non-significant in Sample 2. However, it is SNP 515a that is much closer to transcription initiation and therefore more likely to be of functional consequence. These results were strongest in Caucasians of Northern European ancestry.

[0170] An analysis of the combined sample of 398 families is summarized below in Table 6. In families of Northern European ancestry, 515a was again the strongest with a χ² of 8.5 and p value of 0.004, and an approximately 3 fold excess of transmission to non-transmission. However, 514a was also nominally significant with a χ² of 6.8 and p value of 0.01.

TABLE 6
TDT Analysis of GRK3 SNPs in Samples 1 and 2

       Caucasian N. European (329)   Non N. European (34)   Non Caucasian (35)
SNP    T    N    χ²    p value       T    N                 T    N
514a   36   17   6.8   0.01          2    5                 5    7
515a   23   7    8.5   0.004         2    3                 4    5

Example 5 GRK3 Protein Expression in Lymphoblastoid Cell Lines

[0171] In these experiments, GRK3 protein expression levels in cells from bipolar members of families with evidence of linkage to chromosome 22q11 and normal controls were tested. As GRK3 is expressed in lymphoblastoid cell lines, it is possible to measure levels of GRK3 message and protein directly in cell lines from patients most likely to have the mutation. Lymphoblastoid cells from bipolar I patients from the UCSD Bipolar Genetics Study cohort and normal controls were used at a similar degree of previous expansion (approximately passage 2 after immortalization with Epstein-Barr virus). Each bipolar patient came from a family with a lod score of >0.3 at D22S419 on chromosome 22.

[0172] Cells were grown in RPMI medium containing 10% fetal bovine serum and incubated at approximately 37° C., with 5% CO₂, to a cell density of 1×10⁶ cells/ml. The cells were lysed in lysis buffer (20 mM Tris pH 7.5, 150 mM NaCl, 10 mM EDTA, 1% Triton-X 100, 1% sodium deoxycholate, 1 mM PMSF, 10 μg/ml benzamidine, 10 μg/ml leupeptin, 10 μg/ml soybean trypsin inhibitor, 5 μg/ml aprotinin, 1 μg/ml pepstatin A, 10 mM sodium pyrophosphate, 1 mM sodium orthovanadate, and 1 mM NaF).

[0173] The total gel protein was also determined. The protein concentration was determined using the Bradford method (Bio-Rad). Then, 100 μg of total cell lysate was resolved by SDS-PAGE on a 7% pre-cast gel (NuPAGE, Invitrogen-Novex), and transferred to PVDF membranes (Invitrogen-Novex). The blot was incubated in the primary antibody at 4° C., overnight (anti-GRK3 goat polyclonal IgG, E-15, sc-9306, Santa Cruz, 1/200 dilution), and then with a horseradish peroxidase-conjugated second antibody (anti-goat HRP, sc-2033, Santa Cruz, 1/5000 dilution) for 1 hour. The bound antibodies were visualized by enhanced chemiluminescence, using the protocols recommended by the manufacturer (Amersham). The specificity of the antibody was verified by Western analysis using purified GRK2 and GRK3 proteins. The molecular weight of the detected bands was consistent with that of the purified protein.

[0174] FIG. 3 shows a Western blot in which an antibody specific for GRK3 was used (sc-9306). In this Figure, “bipolar” indicates bipolar members of families with linkage to chromosome 22q11, while “control” indicates normal controls. The “mw” indicates the lane containing molecular weight standards. A significant decrease in GRK3 was observed in 3 out of 6 probands, as compared to controls. Three additional control subjects were examined on a separate blot (not shown) and demonstrated GRK3 levels comparable to that of the controls shown in FIG. 3.

Example 6 GRK3 Protein Expression in Brain Derived Cell Lines

[0175] A neuroblastoma cell line (SK-N-MC) that endogenously expresses GRK3 and demonstrates desensitization to dopamine stimulation has been identified as a suitable model system for studies of transcriptional regulation. Separate sets of PCR primers specific to GRK2 and to GRK3, that span a 300 bp region including exons 11 and 12, have been developed. cDNAs for GRK2 and for GRK3 were separately and specifically amplified by RT-PCR from SK-N-MC cells and confirmed by sequencing. The endogenous expression of GRK3 in SK-N-MC was further confirmed by immunoblotting cell lysates and probing them with a mouse monoclonal antibody that recognizes both GRK2 and GRK3 [C5/1, 1:1000] (Dautzenberg et al., Am J. Physiol. [2001]; and Dautzenberg and Hauger, Neuropharmacology [2001]). ECL+Plus detection was performed (Amersham) and blots were analyzed on the STORM imager using ImageQuant software (Molecular Dynamics).

[0176] As shown in FIG. 4, GRK3 protein was detected in both SK-N-MC cells and in a retinoblastoma (Y79) cell line. Blots were run with purified protein standards for GRK3 and GRK2 that migrated to ~78 kD and ~80 kD, respectively, the known molecular weights of these kinases. Well-defined SK-N-MC and Y79 cell lysate bands that migrated to a position parallel to the GRK3 standard were identified as GRK3 protein. However, no immunoreactive bands in these two cell lines were detected at the position of the GRK2 standard. In addition, GRK2 and GRK3 protein were not detected in rat amygdalar AR5 cells (Mulchahey et al., Endocrinology 140:251-259 [1999]). The use of other GRK2-specific and GRK3-specific polyclonal and monoclonal antibodies is also contemplated (Dautzenberg et al., Am J. Physiol. [2001]; Dautzenberg and Hauger, Neuropharmacology [2001]; and Oppermann et al., Proc Natl Acad Sci USA 93:7649-7654 [1996]).

Example 7 Identification of the GRK3 Transcription Start Site

[0177] Two principal kinds of evidence indicate that it is very likely that GRK3 transcription is initiated within the ~1,600 base pairs of upstream sequence that has been examined in Example 3. First, the GRK3 upstream sequences strongly resemble those of the closely related gene GRK2. For GRK2, the region immediately 5′ to the first exon has been shown to contain multiple transcriptional start sites (Penn and Benovic, J Biol Chem 269:14924-14930 [1994]). In the GRK2 work, a major transcription start site was identified at −245 bp relative to the ATG at which translation is initiated, plus 6 additional minor starts from −47 to −232 bp. In addition, this region has been shown to have promoter activity in multiple cell types that express GRK2 endogenously (Ramos-Ruiz et al., Circulation 101:2083-2089 [2000]). Second, both GRK2 and GRK3 have similar GC-rich regions within 0.5 kb upstream from the start of their open reading frames, strongly suggestive of SP1 sites typically associated with transcriptional initiation in promoters lacking TATA elements. In GRK3, these GC-rich regions extend from about 500 bp upstream of the ATG of the first coding exon through the first exon itself. The domain from −500 bp through −200 bp has ~75% GC content, the next 200 bp consists of ~90% GC, and the first 113 bp of the open reading frame are ~70% GC. Therefore, by analogy to GRK2 and from the presence of typical elements associated with transcriptional initiation, GRK3 transcription is likely to start within ~500 bp of the open reading frame.
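The GC percentages quoted above are window-wise base counts, which can be illustrated as follows. This is a minimal sketch with hypothetical function names; it assumes the upstream sequence is supplied 5′ to 3′ with its last base immediately preceding the ATG, and the sequence used here is a synthetic toy string, not GRK3 sequence.

```python
def gc_fraction(sequence):
    """Fraction of G or C among the unambiguous bases of a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    return sum(b in "GC" for b in bases) / len(bases)


def upstream_window(upstream, start, end):
    """Bases from offset start up to (not including) offset end.

    Offsets are negative distances from the ATG; end may be 0 to mean
    'up to the ATG'. The upstream string ends just before the ATG.
    """
    if not (start < end <= 0):
        raise ValueError("expected start < end <= 0")
    if -start > len(upstream):
        raise ValueError("window extends past the available sequence")
    return upstream[start:end] if end < 0 else upstream[start:]


# Synthetic 500-base example: an AT-only distal part, a GC-only proximal part.
toy_upstream = "AT" * 150 + "GC" * 100
distal_gc = gc_fraction(upstream_window(toy_upstream, -500, -200))
proximal_gc = gc_fraction(upstream_window(toy_upstream, -200, 0))
```

The same two windows, applied to the actual upstream sequence, correspond to the −500 to −200 bp domain and the 200 bp immediately preceding the ATG that are described in the text.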

[0178] A human neuroblastoma cell line (SK-N-MC) that endogenously expresses GRK3 is used for functional studies of GRK3 expression. The first approach contemplated for the identification of transcription start sites involves amplifying and sequencing the 5′ end of the GRK3 mRNA by a “run-off” reverse transcription reaction. This approach permits the length and identity of the 5′ end of the transcript to be determined and indicates whether the first coding exon is truly the first exon of the gene or whether there is an additional upstream intron. Using the “rapid amplification of cDNA ends” (RACE) procedure (GIBCO BRL), a GRK3 gene specific reverse primer and a high temperature reverse transcriptase (ThermoScript, GIBCO BRL) are used to make a cDNA copy of the 5′ end of the GRK3 mRNA. This cDNA is tailed with oligo-dC using terminal transferase, and amplified with the GIBCO BRL forward anchor (poly T) primer and a nested GRK3 gene specific primer. The product is then either sequenced directly, or cloned into a suitable vector (GIBCO BRL).

[0179] To confirm that the cDNA end identified by RACE is indeed the mRNA terminus, RNase protection assays are contemplated. Riboprobes are prepared from overlapping, ~300 bp fragments of the GRK3 promoter region immediately upstream of and encompassing the putative transcription start sites, which have been cloned into a T7/T3 transcription vector. A series of RNase protection assays is conducted in order to identify the 5′ extent of exon one. True sites for transcription initiation are confirmed by the coincidence of the 5′ end of the RACE clones and the 5′ extent of RNase protection.

[0180] Further confirmation and information regarding the approximate length of GRK3 mRNAs are obtained by performing a Northern blot on RNA from SK-N-MC cells. A Northern blot published in 1991 (Benovic et al., FEBS Lett 283:122-126 [1991]) indicated a major transcript of 8 kb. The Sanger Centre database predicts polyA sites which would yield mRNAs of ~2500, ~3500, and ~7500 bp. Because the open reading frame encompasses only 2064 bp, the presence of an abundant 2500 bp transcript places the location of the transcription start site within the expected upstream region. If longer transcripts are present, information from the Northern does not definitively confirm data from the other studies, but provides useful information regarding message size and processing.

Example 8 GRK3 Promoter Studies

[0181] In this Example, methods for examining GRK3 promoter function are described. A 1.5 kb region was amplified from DNA from a subject lacking the promoter SNPs described in Example 3. This region extended from approximately 20 bp upstream of the ATG to −1.5 kb. Restriction sites placed on the primers were used to ligate this product into the multiple cloning site of the pGL3 Basic vector (Promega). This vector includes a firefly luciferase open reading frame downstream from the multiple cloning site, and is designed for transfection studies of promoter function. In addition to this construct, pGL3 Basic without insert was used as a negative control and a pGL3 construct with an SV40 promoter and enhancer was used as a positive control. These constructs were incubated at a 2:1 ratio of 2 μl Superfect (Stratagene) to 1 μg DNA for 10 minutes. This mixture was then added to plates of SK-N-MC neuroblastoma cells for two hours per the manufacturer's recommendations. The medium was then changed and the incubation continued for 24 hours, at which time the cells were lysed and luciferase activity measured in a luminometer. Each experiment was conducted in five replicate plates.

[0182] As shown in Table 7, the pGL3 construct with the GRK3 promoter showed a 5-8 fold increase in luciferase activity over the pGL3 Basic null vector. These results are consistent with this region having promoter function for the GRK3 gene and are similar to results reported for the GRK2 promoter, which showed an approximately 10-20 fold increase in activity. The use of transfection efficiency controls such as β-galactosidase or Renilla luciferase is contemplated. Testing the promoter activity of a series of 5′-deletions of variable length spanning this region is contemplated to define the minimal region necessary to confer transcriptional activity.

TABLE 7
Results of Transfection Experiments

Vector                                 Relative Luciferase Activity
pGL3 Basic                             1
pGL3 + 1.5 kb GRK3 promoter            5-8
pGL3 + SV40 promoter and enhancer      300-600
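Relative luciferase activity of the kind reported in Table 7 is a fold change over the promoterless vector, optionally after normalization to a co-transfected efficiency control. The following is a minimal sketch of that arithmetic under stated assumptions: the function names and all plate readings are hypothetical, chosen only to illustrate the calculation, and are not data from these experiments.

```python
from statistics import mean


def normalized_signal(firefly: list[float], control: list[float]) -> float:
    """Mean per-plate ratio of firefly signal to the co-transfected control signal."""
    if len(firefly) != len(control):
        raise ValueError("each plate needs one firefly and one control reading")
    return mean(f / c for f, c in zip(firefly, control))


def fold_over_basic(construct_signal: float, basic_signal: float) -> float:
    """Relative activity of a construct, with the promoterless vector set to 1."""
    return construct_signal / basic_signal


# Hypothetical readings from five replicate plates (arbitrary light units).
basic_signal = normalized_signal([100, 110, 90, 105, 95], [1000] * 5)
promoter_signal = normalized_signal([600, 650, 550, 700, 500], [1000] * 5)
fold = fold_over_basic(promoter_signal, basic_signal)  # about 6-fold for these toy values
```

Normalizing each plate before averaging keeps plate-to-plate differences in transfection efficiency from inflating or masking the promoter effect.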

[0183] A comparison of the relative transcriptional function of the variant GRK3 haplotypes is also contemplated. Reporter constructs are made for each of the six observed GRK3 promoter SNP haplotypes. This is accomplished by PCR amplification of the 1.6 kb region from genomic DNA of subjects known to have each haplotype. Each construct is sequenced to verify that it contains the desired haplotype. Each of these constructs is transfected into SK-N-MC cells in triplicate, in parallel with the consensus haplotype. Luciferase assays are conducted in triplicate, normalized to the expression of a co-transfected β-galactosidase or Renilla luciferase expression plasmid, and compared by analysis of variance.

Example 9 Nuclear Protein Binding

[0184] In vitro studies of protein-DNA interaction provide a second avenue for examining the functional significance of the GRK3 promoter region SNPs. Specifically, extracts from cells which express GRK3 are suitable for use in the analysis of differences in DNA binding between the consensus and mutant alleles. The two widely used in vitro methods for examining the interaction of cellular transcription factors with potential regulatory elements in target DNA sequences are DNaseI footprinting and electrophoretic mobility shift assays (EMSAs). Because the present work focuses on the transcriptional effect of four discrete single base-pair variants in the GRK3 regulatory region, EMSA assays are the method of choice in these experiments.

[0185] For these assays, 30 base-pair double-stranded oligonucleotides containing the consensus and variant form of each SNP are synthesized. The base-pair containing the SNP is centered in the oligonucleotide sequence, and the length chosen is sufficient to provide recognition sites for a wide variety of monomeric and dimeric transcription factors. The four consensus and four variant oligonucleotides are radiolabelled, and EMSA assays performed in the presence of poly-dI/dC, using standard methods extensively applied to the analysis of neural transcription factors (Gruber et al., Mol Cell Biol 17:2391-2400 [1997]; and Trieu et al., J Neurosci 19:6549-6558 [1999]). It is contemplated that in some cases, for each oligonucleotide sequence, these assays reveal one or several protein-DNA complexes, which appear as slower-migrating bands in polyacrylamide gels. If the consensus and variant GRK3 alleles have different transcription factor binding properties, these are revealed in qualitative or quantitative differences in the pattern of shifted bands. Specific binding is verified by conducting parallel assays in the presence of a 50-fold excess of unlabelled oligonucleotide.
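The design of paired consensus and variant probes can be sketched as follows. This is an illustrative sketch only: the function names and the toy reference sequence are hypothetical, the code handles single-base substitutions only (not deletions), and with an even probe length the SNP sits one position right of the exact midpoint.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, i.e. the second strand of the duplex."""
    return seq.translate(_COMPLEMENT)[::-1]


def snp_probes(reference: str, snp_index: int, variant_base: str, length: int = 30):
    """Return (consensus, variant) top-strand probes with the SNP near the center.

    snp_index is the 0-based position of the SNP in the reference sequence.
    """
    left = length // 2
    start = snp_index - left
    end = start + length
    if start < 0 or end > len(reference):
        raise ValueError("not enough flanking sequence around the SNP")
    consensus = reference[start:end]
    variant = consensus[:left] + variant_base + consensus[left + 1:]
    return consensus, variant


# Hypothetical reference with a C at position 20 and a C-to-G substitution.
reference = "A" * 20 + "C" + "T" * 20
consensus, variant = snp_probes(reference, snp_index=20, variant_base="G")
bottom_strand = reverse_complement(consensus)
```

Each top strand would be annealed to its reverse complement to give the double-stranded probe.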

[0186] In principle, these EMSA assays are suitable for the assessment of whether the transcription factor pool of any cell type can discriminate between the GRK3 consensus and variant promoter sequences. Clearly, however, this is only of biological interest in cells that express GRK3 endogenously. Thus, this EMSA analysis is applied initially to cellular extracts from SK-N-MC cells, and SK-N-MC cells that have been treated with dopamine. Cellular extracts are prepared by previously described methods (Carter, Biochem Biophys Res Commun 166:589-594 [1990]; and Kelsoe et al., Nature 342:238-243 [1989]). Extracts of other cell lines that strongly express GRK3 are also suitable for use in these assays. The immediate goal of EMSA analysis is to demonstrate differential binding of transcription factors to oligonucleotides containing one or more of the GRK3 promoter variants. To test whether differences in EMSA assays reveal transcriptionally significant effects, a single copy and 3× concatamers of the consensus and variant oligonucleotides are linked to a minimal promoter in the pGL3 reporter system (Gruber et al., supra [1997]), and compared in transfection assays in SK-N-MC cells. It is contemplated that examination of transcriptional activity in this controlled context will reveal functional differences that are obscured in the context of the entire 1.6 kb promoter region.

[0187] It is also possible to obtain some insight into which transcription factors account for the mobility shifts of the GRK3-derived oligonucleotides through the examination of DNA sequences. For instance, SNP 514b alters a predicted binding site for the ubiquitous transcriptional regulator SP1 (See, FIG. 2). However, the ability to predict specific transcription factor binding sites from DNA sequence data is presently rudimentary. Recently, biotechnology companies have made a substantial effort to market antisera to a wide range of transcription factors (CeMines, Santa Cruz Biotechnology). These antibodies can be used to “supershift” EMSA complexes in polyacrylamide gels. By narrowing the list of candidate factors using sequence data, then applying these specific reagents, these methods provide a strong possibility of identifying the transcriptional regulators that interact with the polymorphic GRK3 sequences. More general methods, such as expression screening of phage libraries (e.g., those derived from SK-N-MC cells), and one-hybrid screening in yeast, allow the cloning and identification of DNA binding factors identified by EMSA assays for which no specific antisera exist.

Example 10 Allele Specific Transcript Quantification

[0188] As discussed in greater detail above, the transfection experiments of Example 8 examine the function of a relatively short region of the GRK3 regulatory sequence. This determines the functional significance of the polymorphisms within this region. However, it is possible that additional upstream variants contribute to the phenotype and are in linkage disequilibrium with the detected polymorphisms. In this case, the true functional mutations would be overlooked in some transfection studies. This intrinsic limitation of the transfection assays can be overcome if the transcripts from the genomic consensus and variant alleles can be distinguished in GRK3-expressing cells from heterozygous patients, yielding an assay of allele-specific gene expression. In a subset of subjects, SNP 515b, which is very likely to reside within the GRK3 5′-UTR, allows such an assay to be performed. Measuring the ratio of allele-specific GRK3 expression within a cell line also has the advantage of comparison against a naturally occurring internal control, thereby eliminating differences in expression resulting from a variety of factors ranging from the subject's medical or treatment history or age, to transformation by EBV and subsequent expansion in culture. It is contemplated that SNP 515b affects a translational regulation element, and that it is a functional SNP. Means to determine this possibility are provided by the transfection studies described in Example 8. Even if SNP 515b does affect translational regulation, the approach described herein is suitable for testing of additional differences in transcriptional regulation, as only differences in mRNA levels are examined.

[0189] As discussed in Example 5, GRK3 is expressed in lymphoblastoid cell lines. Thus, cell lines from patients who carry SNP 515b are suitable for use for allele-specific GRK3 expression. As shown in Table 4, the haplotype with variants at sites 514a, 515a and 515b is the most common variant haplotype in the 110 families. At present, 18 subjects heterozygous for this most common variant haplotype (514a/515a/515b, nine parents and nine of their offspring), one subject heterozygous for sites 515a/515b, and four subjects heterozygous for 515b only, have been identified. The use of an allele specific expression assay, based on single base pair extension (SBE), is contemplated. mRNA from the cell line being interrogated is DNase I treated, reverse transcribed using ThermoScript (GIBCO BRL) and a GRK3 specific primer, then a 238 bp fragment containing the 515b SNP is amplified by PCR using primers already proven by sequencing to produce a GRK3 specific product. SBE primers are designed which terminate one bp proximal to the 515b variant. Since the variant is a one bp deletion (See, FIG. 2), a single base addition using ddCTP and ddGTP labeled with different fluorescent tags adds a G to the wild type allele, but a C to the variant, which is distinguishable by fluorescent color. Primers fluorescently labeled by the single base extension reaction are separated from unincorporated nucleotides and the fluorescence intensity produced by the G vs. C fluors is determined on an ABI 7700. The ratio of fluor intensities is used to quantify haplotype specific expression. Validity of the system and signal intensities produced from a true 50:50 ratio of variant to wild type starting material is determined by two techniques. First, the two species of RNA are produced from riboprobe vectors, carefully quantitated, and mixed at a 1:1 ratio, then tested. Second, the PCR and SBE steps of the system are tested on genomic DNA from homozygous wild type vs. heterozygous individuals.
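The conversion of fluor intensities into an allele-specific expression ratio, corrected against the signal from a known 1:1 template mix, can be sketched as follows. This is a minimal illustration assuming a simple multiplicative dye bias; the function name and all intensity values are hypothetical and are not measurements from this work.

```python
def allele_ratio(variant_signal: float, wildtype_signal: float,
                 calibration: float = 1.0) -> float:
    """Variant:wild-type expression ratio from SBE fluor intensities.

    calibration is the variant/wild-type intensity ratio observed for a
    template known to contain the two alleles at 1:1, which corrects for
    unequal dye brightness and incorporation efficiency.
    """
    if wildtype_signal <= 0 or calibration <= 0:
        raise ValueError("signals and calibration must be positive")
    return (variant_signal / wildtype_signal) / calibration


# Hypothetical intensities: a 1:1 control mix reads C=1200 and G=1000.
calibration = allele_ratio(1200.0, 1000.0)
# Hypothetical cDNA sample from a heterozygous cell line.
sample_ratio = allele_ratio(900.0, 1000.0, calibration)
```

A corrected ratio near 1 would indicate equal expression of the two alleles; a consistent departure from 1 across replicates would suggest allele-specific regulation.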

[0190] Patient lymphoblastoid cell lines from subjects heterozygous for each of the three haplotypes are thawed and grown under controlled conditions, so as to assure a similar degree of expansion, cell density (10⁶ cells/ml) and growth conditions. Allele specific expression is then determined as described above. Each measurement is conducted in triplicate and differences assessed by ANOVA.
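A one-way ANOVA on triplicate measurements from several haplotype groups reduces to a ratio of between-group to within-group variance. The following is a minimal sketch of the F statistic using only the standard library; the function name and the example values are hypothetical, and in practice a statistics package would also supply the p-value.

```python
from statistics import mean


def one_way_anova_f(groups: list[list[float]]) -> tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and replicate measurements")
    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variance; F is undefined")
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical triplicate allele ratios for three haplotype groups.
f_stat, df_b, df_w = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]])
```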

[0191] As discussed in regard to the transfection studies, it is possible that the effect of some promoter variants will only be manifest when the system is challenged to turn on expression. Thus, it is contemplated that patient lymphoblastoid cell lines provide a system which can be pharmacologically challenged for additional assessment of promoter function. Preliminary experiments suggest that lymphoblastoid cell lines do not express dopamine receptors. However, they are well known to express β-adrenergic receptors, generate cAMP in response to β agonists, and to desensitize in response to prolonged treatment (Yu et al., Neuropsychopharmacol 21:147-152 [1999]; Wright et al., Ann Hum Genet 48:201-214 [1984]). It is contemplated that GRK3 mediates this desensitization. If a difference in allele specific expression is not demonstrated in unchallenged cells, RT-PCR experiments are conducted to determine if GRK3 mRNA levels are induced by the β agonist, isoproterenol. If so, then further experiments are conducted to determine the dose response curve and time course, in order to choose optimal conditions for maximal stimulation of GRK3 expression. Then, SNP 515b is used in similar fashion and with the same cell lines described above to examine haplotype specific transcription in pharmacologically challenged lymphoblastoid cell lines.

Example 11 Screening for Additional Mutations

[0192] As described in Example 10, it is contemplated that the SNPs identified in the 1.6 kb upstream region are not the functional SNPs themselves, but rather are in linkage disequilibrium with the actual functional variants that are located elsewhere in the gene. The allele specific expression experiments in patient lymphoblastoid cell lines described above are designed to detect such a possibility. In addition to these functional expression experiments, the identification by sequencing of additional functional variants is contemplated.

[0193] The challenge of such a problem is the large size of the genomic regions that could potentially be involved. Enhancer or repressor elements have been identified in some genes tens of kb upstream from transcription initiation. Similarly, many genes with large first introns, such as GRK3, have regulatory elements in intron 1. The target is somewhat bracketed by the negative linkage disequilibrium results from flanking SNPs. These data indicate that the functional regulatory SNPs are likely between −30 kb and +100 kb of the ATG. However, this is still an enormous area. Thus, the use of evolutionary conservation of regulatory sequence is contemplated as a guide in selecting regions to sequence. Transcriptional regulatory elements are frequently highly conserved across a wide range of species. Therefore, it is contemplated that non-coding sequences conserved between mouse and human in the vicinity of the GRK3 gene reflect conserved regulatory elements, and will find use in guiding sequencing efforts. Mouse and human genomic sequences from 50 kb upstream of the ATG to 50 kb 3′ of the last exon are compared using BLAST algorithms and by eye. Conserved regions are prioritized based on the degree of sequence conservation, and correspondence to known transcription factor consensus sequences in the TRANSFAC database (using the NSITE program on the Sanger Centre web page). These regions are screened by sequencing in subjects with bipolar disorder using the same approach and methods employed in the identification of the four promoter SNPs already identified.

[0194] Primers are designed so as to amplify PCR products from genomic regions of approximately 300 bp around each conserved region. These regions are amplified from the same 14 subjects studied previously whose families have positive lod scores at the marker D22S419 near the GRK3 gene. Fragments are sequenced bidirectionally using the Perkin Elmer Big Dye fluor-ddNTP sequencing kit and an ABI 377 sequencer, per the manufacturer's recommendations. Minor modifications are used for sequencing GC-rich regions such as those around the promoter (e.g., annealing temperature of 54° C. and addition of 5% DMSO). Sequencing gels are tracked and data extracted using ABI sequence analysis software.

[0195] Chromatogram files are then transferred to a Sun UNIX workstation for assembly into contigs using the Phred/Phrap/Consed suite of programs, and SNPs are identified using PolyPhred and by visual inspection. All promoter and exon sequences are visually scanned to evaluate sequence quality, confirm SNPs, and check for possible false negatives (i.e., missed SNPs). Likewise, any regions of reduced sequence quality (<30 on the Phred/Phrap scale, or an approximate error rate of 1:1000) are visually inspected and resequenced, if necessary. SNPs identified in this fashion are genotyped in the triad sample and tested for linkage disequilibrium to bipolar disorder. SNPs that demonstrate genetic association to bipolar disorder are tested for functional impact using the same general approaches described for the promoter SNPs.
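The correspondence between a quality score of 30 and an error rate of 1:1000 follows from the standard Phred definition, Q = −10 log10(P). The following is a minimal sketch of that conversion and of flagging stretches that fall below the threshold; the function names and the example quality values are hypothetical and are not part of the Phred/Phrap/Consed programs.

```python
import math


def phred_to_error(q: float) -> float:
    """Per-base error probability for Phred quality q: P = 10 ** (-q / 10)."""
    return 10 ** (-q / 10)


def error_to_phred(p: float) -> float:
    """Phred quality for per-base error probability p: Q = -10 * log10(p)."""
    return -10 * math.log10(p)


def low_quality_runs(quals: list[int], threshold: int = 30) -> list[tuple[int, int]]:
    """Half-open (start, end) index ranges where quality is below threshold."""
    runs, start = [], None
    for i, q in enumerate(quals):
        if q < threshold and start is None:
            start = i  # a low-quality stretch begins here
        elif q >= threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(quals)))
    return runs


# Hypothetical per-base qualities for a short read.
flagged = low_quality_runs([40, 40, 20, 25, 40, 10])
```

Ranges returned by such a scan would be candidates for visual inspection and resequencing.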

Example 12 Screening of Compounds

[0196] In this Example, methods for screening compounds that increase the expression and function of psychosis-suppressor genes and/or decrease the expression and function of psychogenes in the basal state and preferably in the presence of an appropriate agonist are provided. In one particular embodiment, compounds that increase the action of GRK3 in both the basal and agonist-challenged states are identified. However, it is not intended that the present invention be limited to compounds that impact the function and/or expression of GRK3, as it is contemplated that the present invention will find use in screening and identifying various other compounds. It is further intended that the present invention will find use with other genes and compounds that affect their expression. Thus, it is not intended that the present invention will be limited to GRK3 and/or dopamine or any other neurotransmitter, agonist, and/or pharmacological compound (i.e., it is contemplated that any appropriate compound will find use in the present invention).

[0197] In these particular experiments, lymphoblastoid cells obtained from normal control subjects, and subjects with bipolar disorder (e.g., with a genetic defect in GRK3) are grown and maintained as described in Example 5. As these cells express GRK3, adenylate cyclase, and the necessary G proteins, they are contemplated as being particularly useful in these methods. The cells are tested “unchallenged” (i.e., without dopamine agonist) as well as “challenged” (i.e., in the presence of a dopamine agonist). Various concentrations of dopamine and the compound are tested in each of these experiments. The cells are tested for the level of GRK3 mRNA expression, GRK3 protein expression, D1 receptor phosphorylation, and cAMP production. In the presence of the dopamine agonist, compounds of particular interest increase GRK3 mRNA expression, GRK3 protein expression, and D1 receptor phosphorylation, and decrease cAMP production.

[0198] In additional experiments, the cell lines are challenged with at least one beta adrenergic agonist. Thus, in these experiments, the cells are tested with various compounds in the presence or absence of beta adrenergic agonist(s), to determine the ability of the test compounds to modulate GRK3 function. Thus, as with tests including dopamine agonists, in these tests, compounds of particular interest increase GRK3 mRNA expression, GRK3 protein expression, and D1 receptor phosphorylation, and decrease cAMP production.

[0199] These screening methods need not be limited to lymphoblastoid cell lines. In preferred embodiments, neurally derived cell lines (e.g., SK-N-MC) are used. In addition, the screening methods of the invention need not be limited to measurement of endogenous GRK3. In fact, the use of a reporter construct designed to express luciferase or green fluorescent protein from a GRK3 promoter is contemplated. Such an assay includes a dopamine agonist, a neurally derived cell line transfected with a GRK3 reporter construct, and the test compound. The effect of the test compound on GRK3 expression is measured by quantitating light or fluorescence output.

[0200] All publications and patents mentioned in the above specification are herein incorporated by reference. Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in the art and/or related fields are intended to be within the scope of the following claims.

What is claimed is:
 1. A method for the identification of genes associated with psychiatric disorders, comprising the steps of: a) providing test antisense cRNA and control antisense cRNA; b) hybridizing said test antisense cRNA and said control antisense cRNA to a microarray comprising at least two nucleic acids; c) measuring the hybridization of said test antisense cRNA and said control antisense cRNA to said nucleic acids; d) comparing said hybridization of said test antisense cRNA with said hybridization of said control antisense cRNA to provide a hybridization score; e) determining whether said hybridization score indicates said test antisense cRNA represents a gene with altered expression; and f) determining whether said gene maps to a psychiatric disorder linkage region.
 2. The method of claim 1, wherein said gene is a human homologue.
 3. The method of claim 1, wherein said gene maps to within about 10 cM of a putative marker associated with a psychiatric disorder.
 4. The method of claim 3, wherein said putative marker associated with a psychiatric disorder has been identified as such in human genetic studies.
 5. The method of claim 1, wherein said gene with altered expression is selected from the group consisting of induced genes and repressed genes.
 6. The method of claim 1, wherein said microarray comprises at least one gene chip.
 7. The method of claim 1, wherein said hybridized test antisense cRNA and said control antisense cRNA are labelled.
 8. The method of claim 7, wherein said label is selected from the group consisting of fluorescent labels, luminescent labels, enzyme labels, and radioactive labels.
 9. The method of claim 1, wherein said psychiatric disorder is selected from the group consisting of bipolar disorder, manic-depressive illness, unipolar depression, major depression, schizophrenia, schizoaffective disorder, and attention deficit disorder.
 10. The method of claim 1, wherein said test antisense cRNA is obtained from an animal treated with a dopamine agonist and said control antisense cRNA is obtained from an animal not treated with a dopamine agonist.
 11. The method of claim 10, wherein said dopamine agonist is selected from the group consisting of amphetamine, methamphetamine, cocaine and methylphenidate.
 12. A method for diagnosing bipolar disorder comprising detecting sequence variation in at least one fragment of a G protein-coupled receptor kinase 3 gene obtained from a subject.
 13. The method of claim 12, wherein said detecting comprises nucleotide sequencing.
 14. The method of claim 12, wherein said subject is at risk of developing bipolar disorder.
 15. The method of claim 12, wherein said fragment of G protein-coupled receptor kinase 3 gene comprises the promoter of said G protein-coupled receptor kinase 3 gene.
 16. The method of claim 12, wherein said sequence variation comprises a thymine to cytosine transition at approximately 1330 base pairs upstream of the translation start site of said G protein-coupled receptor kinase 3 gene.
 17. The method of claim 12, wherein said sequence variation comprises an adenine to guanine transition at approximately 1306 base pairs upstream of the translation start site of said G protein-coupled receptor kinase 3 gene.
 18. The method of claim 12, wherein said sequence variation comprises a thymine to guanine transversion at approximately 1197 base pairs upstream of the translation start site of said G protein-coupled receptor kinase 3 gene.
 19. The method of claim 12, wherein said sequence variation comprises an adenine to guanine transition at approximately 901 base pairs upstream of the translation start site of said G protein-coupled receptor kinase 3 gene.
 20. The method of claim 12, wherein said sequence variation comprises a guanine to adenine transition at approximately 383 base pairs upstream of the translation start site of said G protein-coupled receptor kinase 3 gene.
 21. The method of claim 12, wherein said sequence variation comprises a guanine deletion at approximately 110 base pairs upstream of the translation start site of said G protein-coupled receptor kinase 3 gene.
 22. The method of claim 12, wherein said sequence variation is predictive of a subject's response to an antidepressant, wherein said response is selected from the group consisting of hypomania, mania and psychosis.
 23. A method for screening compounds that alter expression of at least one psychiatric gene, comprising the steps of: a) providing: a plurality of cells comprising psychiatric genes, standard medium, medium containing at least one dopamine agonist, and at least one test compound; b) incubating a first aliquot of said cells with said standard medium and said at least one test compound; c) incubating a second aliquot of said cells with said medium containing at least one dopamine agonist and said at least one test compound; d) quantitating the expression of said psychiatric genes in said first aliquot and quantitating the expression of said psychiatric genes in said second aliquot; and e) comparing the expression of said psychiatric genes in said first aliquot with the expression of said psychiatric genes in said second aliquot.
 24. The method of claim 23, wherein said psychiatric genes are selected from the group consisting of psychogenes and psychosis-suppressor genes.
 25. The method of claim 23, wherein said quantitating is selected from the group consisting of Northern blots, RT-PCR, Western blots, enzyme-linked immunosorbent assays, fluorescence immunoassays, radioimmunoassays, luciferase assays, fluorescence assays, and flow cytometry.
 26. The method of claim 23, wherein said psychiatric genes are selected from the group consisting of the G protein-coupled receptor kinase 3 (GRK3) gene, the D-box binding protein (DBP) gene, the farnesyl-diphosphate farnesyltransferase (FDFT1) gene, the vertebrate LIN7 homolog 1 (VELI1) gene, the sulfotransferase 1A1 (SULT1A1) gene, and the insulin-like growth factor 1 (IGF1) gene.